(57) Abstract

The present invention relates to gene therapy, especially to adenovirus-based gene therapy, and related cell lines and compositions.
In particular, novel packaging cell lines are disclosed, for use in facilitating the development of high-capacity vectors. The invention also 
discloses a variety of high-capacity adenovirus vectors and related compositions and kits including the disclosed cell lines and vectors. 
Finally, the invention discloses methods of preparing and using the disclosed vectors, cell lines and kits. 
PACKAGING CELL LINES FOR USE IN FACILITATING THE DEVELOPMENT OF HIGH-CAPACITY ADENOVIRAL VECTORS

This invention was made with U.S. government support under NIH Grant No. HL 54352. The
government has certain rights in the invention. 

The present invention relates to gene therapy, especially to adenovirus-based gene 
therapy. In particular, novel packaging cell lines are disclosed, for use in facilitating the 
development of high-capacity vectors. High-capacity adenovirus vectors are also disclosed 
herein, as are related compositions, kits, and methods of preparation and use of the disclosed
vectors, cell lines and kits.

Enhanced transfer of DNA conjugates into cells has been achieved with adenovirus, a 
human DNA virus which readily infects epithelial cells (Horwitz, "Adenoviridae and their 
replication", in Virology, Fields and Knipe, eds., Raven Press, NY (1990) pp. 1679-1740).

Although adenovirus-mediated gene therapy represents an improved method of DNA
transfer into cells, a potential limitation of this approach is that adenovirus replication results in
disruption of the host cell. In addition, adenovirus also possesses oncogenic properties
including the ability of one of its proteins to bind to tumor suppressor gene products. The use
of so-called replication defective strains of adenovirus (which typically possess E1A and/or
E1B deletions that render the virus unable to replicate in host cells) is in principle more suitable
for in vivo therapy; however, the potential of co-infection of epithelial cells with wild-type 
strains of virus resulting in transactivation of the recombinant virus may represent a significant 
safety concern for in vivo applications. Furthermore, it is not yet known which recombinant 
adenoviruses are capable of integrating their genome into host cell DNA allowing for 
long-term stable expression of any foreign genes they may be transporting. 

Another undesirable aspect of using intact or replication-competent adenovirus as a 
gene transfer means is that it is an oncogenic virus whose gene products are known to interfere 
with the function of host cell tumor suppressor proteins as well as immune recognition 
molecules, such as the major histocompatibility complex (MHC). In addition, pre-existing 
circulating antibodies to adenovirus may significantly reduce the efficiency of in vivo gene 
delivery. Lastly, only a foreign gene of 6 kilobases (kb) or less can be incorporated into the 
intact adenovirus genome for gene transfer experiments, whereas DNA segments of greater 
than 15 kb can be transferred using the methods of this invention. 
invention may also be engineered to target the receptors of and achieve penetration of
non-epithelial cells; means of engineering viral vectors to accomplish these ends are described in
detail hereinbelow. 

Second, the vectors of the present invention have deletions of substantial portions of 
the Ad genome, which not only limits the ability of the Ad-derived vectors to "spread" to other 
host cells or tissues, but allows significant amounts of "foreign" (or non-native) nucleic acids 
to be incorporated into the viral genome without interfering with the reproduction and 
packaging of the viral genome. Therefore, the vectors of the present invention are ideal for use 
in a wide variety of therapeutic applications. 

Third, while the vectors disclosed herein are safe for use as therapeutic agents in the 
treatment of a variety of human afflictions, they do not require the presence of any "helpers" 
for propagation and packaging, largely because of the novel cell lines in which they are 
reproduced. Such cell lines - referred to herein as packaging cell lines - comprise yet another
aspect of the invention.

To reduce the frequency of contamination with wild-type adenovirus, it is desirable to 
improve either the viral vector or the cell line to reduce the probability of recombination. For 
example, an adenovirus from a group with less homology to the group C viruses may be used 
to engineer recombinant viruses with little propensity for recombination with the Ad5 sequence 
in 293 cells. Similarly, an epithelial cell line - 293 or another - may be prepared according to 
within-disclosed methods which stably expresses adenovirus proteins or polypeptides from 
Ad3 and/or proteins or polypeptides from another non-group-C or group C serotype; such a 
cell line would be useful for supporting adenovirus-derived viral vectors bearing deletions of
regulatory and/or structural genes, irrespective of the serotype from which such a vector was 
derived. 

It is also contemplated that the constructs and methods of the present invention will 
support the design and engineering of chimeric viral vectors which express amino acid residue 
sequences derived from two or more Ad serotypes. Thus, unlike methods and constructs 
available prior to the advent of the present disclosure, this invention allows the greatest 
possible flexibility in the design and preparation of useful viral vectors and cell lines which 
support their construction and propagation - all with a decreased risk of recombining with 
wild-type Ad to produce potentially-harmful recombinants. 
g. polypeptide VII;

h. polypeptide VIII; and

i. biologically active fragments thereof is disclosed. 

In one variation, the sequences are constitutively expressed; in another, one or more 
sequences is under the control of a regulatable promoter. In a preferred embodiment 
expression is constitutive. In various preferred embodiments, the polypeptides expressed by the 
DNA sequences are biologically active. 

In a further and preferred embodiment the packaging cell line of the present invention 
supports the production of a viral vector. In a preferred embodiment the viral vector is a 
therapeutic vector. 

In one aspect of the present invention, each DNA sequence is introduced into the 
genome of the within-disclosed cell lines via a separate complementing plasmid. In other 
embodiments, two or more DNA sequences are introduced into the genome via a single
complementing plasmid. In one variation, the complementing plasmid comprises a DNA 
sequence encoding adenovirus fiber protein, polypeptide or fragment thereof. An example of a 
useful complementing plasmid according to the present invention is a plasmid having the 
characteristics of pCLF (for deposit details, see Example 3).

In another aspect of the present invention, the complementing plasmid used to
transform a cell line of the present invention further comprises a DNA sequence encoding an
adenovirus regulatory protein, polypeptide or fragment thereof. In one variation, the
regulatory protein is selected from the group consisting of E1A, E1B, E2A, E2B, E3, E4 and
L4 (also referred to as "the 100K protein"); an exemplary complementing plasmid has the
characteristics of pE4/Hygro (for deposit details, see Example 3). In another aspect, the
complementing plasmid used to transform a cell line of the present invention further comprises 
a DNA sequence encoding two or more of the above mentioned adenovirus regulatory 
proteins, polypeptides or fragments thereof. 

In one variation, the two or more regulatory proteins, polypeptides or fragments 
thereof are selected from the group consisting of E1A, E1B, E2A, E2B, E3, E4 and L4 (also
referred to as "the 100K protein"). In another variation, the structural protein is selected from
the group consisting of penton base; hexon; fiber; polypeptide IIIa; polypeptide V; polypeptide
VI; polypeptide VII; polypeptide VIII; and biologically active fragments thereof.
regulatory proteins (or portions thereof) that have been deleted. Thus, in one embodiment, the
foreign DNA encodes a tumor-suppressor protein; a suicide protein; a cystic fibrosis
transmembrane conductance regulator (CFTR) protein; or a biologically active fragment of any
of them.

Any of the within-disclosed cell lines may have a DNA sequence encoding all or part of 
a fiber protein - including modified or chimeric proteins - stably integrated into the genome.
Thus, in one variation, the fiber protein has been modified to include a non-native amino acid
residue sequence which targets a specific receptor, but which does not disrupt trimer formation 
or transport of fiber into the nucleus. In one variation, the non-native amino acid residue 
sequence is coupled to the carboxyl terminus of the fiber. In yet another, the non-native amino 
acid residue sequence further includes a linker sequence. Alternatively, the fiber protein 
further comprises a ligand coupled to the linker. A suitable ligand may be selected from the 
group consisting of ligands that specifically bind to a cell surface receptor and ligands that can 
be used to couple other proteins or nucleic acid molecules. In one variation, the ligand is 
selected from the group consisting of ligands that specifically bind to a cell surface receptor 
and ligands that can be used to couple other proteins or nucleic acid molecules. 

In yet another embodiment, the non-native amino acid residue sequence is incorporated 
into the fiber amino acid residue sequence at a location other than one of the fiber termini. 
Alternatively, the non-native amino acid residue sequence alters the binding specificity of the 
fiber for a targeted cell type. In other embodiments, the linker sequence alters the binding 
specificity of the fiber for a targeted cell type. The expressed fiber may, in various 
embodiments, bind to a specific targeted cell type not usually targeted by adenovirus and/or 
may comprise amino acid residue sequences from more than one adenovirus serotype. 

In various aspects of the present invention, a packaging cell line of the present 
invention is derived from a procaryotic cell line; in another, it is derived from a eucaryotic cell 
line. While various embodiments suggest the use of mammalian cells, and more particularly, 
epithelial cell lines, a variety of other, non-epithelial cell lines are used in various embodiments. 
Thus, while various embodiments disclose the use of a cell line selected from the group 
consisting of 293, A549, W162, HeLa, Vero, 211, and 211A cell lines, it is understood that
various other cell lines are likewise contemplated for use as disclosed herein. 
The invention further discloses complementing plasmids and methods of making same.
In one embodiment, a complementing plasmid comprises a promoter nucleotide sequence 
operatively linked to a nucleotide sequence encoding an adenovirus structural polypeptide. In 
one variation, the complementing plasmid comprises pCLF. In another variation, a 
complementing plasmid further comprises a nucleotide sequence encoding a first adenovirus 
regulatory polypeptide, a nucleotide sequence encoding a second regulatory polypeptide, a 
nucleotide sequence encoding a third regulatory polypeptide; or any combination of the 
foregoing. In still another embodiment, the adenovirus structural polypeptide is selected from 
the group consisting of penton base; hexon; fiber; polypeptide IIIa; polypeptide V; polypeptide
VI; polypeptide VII; polypeptide VIII; and biologically active fragments thereof.

The present invention also discloses a complementing plasmid comprising a promoter 
nucleotide sequence operatively linked to a nucleotide sequence encoding an adenovirus 
structural protein, polypeptide or fragment thereof and a nucleotide sequence encoding an 
adenovirus regulatory protein, polypeptide or fragment thereof. In one variation, the early 
region polypeptide is E4; in another, the plasmid comprises pE4/Hygro. In still another 
variation, the early region polypeptides are E1 and E4. Complementing plasmids further
comprising a nucleotide sequence encoding an adenovirus structural protein, polypeptide or
fragment thereof are also contemplated, as are plasmids wherein the promoter nucleotide
sequence is selected from the group consisting of MMTV, CMV and E4 promoter nucleotide 
sequences. 

Viral vectors are also disclosed which comprise nucleotide sequences encoding a 
packaging signal and a foreign protein or polypeptide, wherein the nucleotide sequence 
encoding an adenovirus structural protein has been deleted. In one variation, the nucleotide 
sequence encoding the foreign protein or polypeptide is a DNA molecule up to about 3 kb in 
length; in another, the nucleotide sequence encoding the foreign protein or polypeptide is a 
DNA molecule up to about 9.5 kb in length; in still another, the nucleotide sequence encoding 
the foreign protein or polypeptide is a DNA molecule up to about 12.5 kb in length.
Nucleotide sequences of intermediate lengths are also contemplated by the present invention,
as are sequences in excess of 12.5 kb.

The invention also discloses viral vectors wherein the sequence encoding a foreign 
protein or polypeptide is a sequence encoding an anti-tumor agent, a tumor suppressor protein, 
further comprises a nucleotide sequence encoding a foreign polypeptide. In various
embodiments, the polypeptide is a therapeutic molecule. 

Another embodiment discloses a composition as before, wherein the first delivery 
plasmid lacks adenovirus packaging signal sequences. In another aspect, the second delivery 
plasmid contains a LacZ reporter construct. Another variation discloses that the second 
delivery plasmid further lacks a nucleotide sequence encoding an adenovirus regulatory 
protein. In one variation, the regulatory protein is E1. In one embodiment of the above-noted
compositions, the complementing plasmid has the characteristics of pCLF. 

In another embodiment, a composition is disclosed wherein the first delivery plasmid
lacks a nucleotide sequence encoding an adenovirus structural protein and the second delivery
plasmid lacks a nucleotide sequence encoding adenovirus E1 protein. In another, the first
delivery plasmid lacks a nucleotide sequence encoding adenovirus E4 protein and the second
delivery plasmid lacks a nucleotide sequence encoding adenovirus E1 protein. In still another,
the cell contains at least one complementing plasmid encoding an adenoviral regulatory protein
and a structural protein.

In alternative embodiments, the regulatory protein is E4 and the structural protein is 
fiber; or the regulatory protein is E1 and the structural protein is fiber. In still another
embodiment, the adenoviral regulatory protein and the structural protein are encoded by 
separate complementing plasmids. 

Another variation discloses a composition wherein the cell is selected from the group 
consisting of 293, A549, W162, HeLa, Vero, 211, and 211A. In another embodiment, the
delivery plasmid is DV1, or pΔE1Bβgal, or pΔE1sp1B.

Various methods of making and using the vectors, plasmids, cell lines and other 
compositions and constructs of the present invention are also disclosed herein. The following 
methods are considered exemplary and not limiting. 

Thus, in one variation, the invention discloses a method of constructing therapeutic 
viral vectors, comprising introducing a delivery plasmid into an Ad fiber-expressing 
complementing cell line, wherein the DNA sequence encoding Ad fiber protein has been 
deleted from the delivery plasmid. In one variation, the delivery plasmid further includes a 
DNA sequence encoding a foreign protein, polypeptide, or fragment thereof. In other 
embodiments, the delivery plasmid is DV1, pΔE1Bβgal, pΔE1sp1B, or similar constructs.
Another variation discloses that the composition and the therapeutic agent are each included in
a separate receptacle or container. 

It will also be appreciated that any combination of the preceding elements may also be 
efficacious as described herein, and that all related methods are also within the scope of the 
present invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a schematic diagram of the entire adenoviral E4 transcriptional unit with the 
open reading frames (ORF) indicated by blocked segments along with the promoter and 
terminator sequences. The location of primers for amplifying specific portions of E4 are also 
indicated as further described in Example 1A.

Figure 2 is a schematic map of plasmid pE4/Hygro as further described in Example 1B.

Figure 3 is a schematic map of plasmid pCDNA3/Fiber as further described in Example 1B.

Figure 4 is a schematic map of plasmid pCLF as further described in Example 1B.

Figure 5 is a photograph of a Southern blot showing the presence of intact adenovirus 
E4 3.1 kilobase (kb) insert in the 211 cell line as further described in Example 1C.

Figure 6 is a photograph of a Western blot showing labeled fiber protein detected under
native and denaturing electrophoresis conditions as described in Example 1C. The 293 cells
lack fiber while the sublines 211A, 211B and 211R contain fiber protein detectable in
functional trimerized form and denatured monomeric form.

Figure 7 is a schematic map of plasmid pDEX/E1 as further described in Example 1D.

Figure 8 is a schematic map of plasmid pE1/Fiber as further described in Example 1F1.

Figure 9 is a schematic map of plasmid pE4/Fiber as further described in Example 1F2.

Figure 10 is a schematic illustration of linearized pΔE1Bβgal delivery
plasmids for use in cotransfection and recombination to form a recombinant adenoviral vector
having multiple adenoviral gene deletions. The plasmids and recombination event are more 
fully described in Example 2A. 

Figure 11 is a schematic of plasmid p11.3 as further described in Example 2A used in
the construction of pDV44 delivery plasmid with plasmid p8.2. 
serial CsCl centrifugations. 10 mg of the purified viral particles was then electrophoresed under
denaturing conditions and transferred to a PVDF membrane. Ad5 fiber was detected with a polyclonal
rabbit antibody raised against recombinant Ad2 fiber. As a positive control for detection, 400 ng of
Ad5 fibers are indistinguishable and the antibody reacts with both proteins.

Fig. 16. Nuclear localization of the recombinant fiber protein in three packaging cell lines: Cells
were grown on 8-well chamber slides, stained with a rabbit anti-fiber polyclonal antibody and visualized
with a FITC-conjugated goat anti-rabbit antibody. A) Line 211A. B) Line 211B. C) Line 211R. D)
293 cells (negative control). E) 293 cells infected with Ad.RSVbgal at 1 pfu/cell and stained 24 hours
post-infection (positive control). F) Infected cells prepared as in (E) but stained without the primary
antibody.



DETAILED DESCRIPTION

To reduce the frequency of contamination with wild-type
adenovirus, it is considered desirable to improve either the viral vector or the cell line to reduce 
the probability of recombination. For example, an adenovirus from a group with less 
homology to the group C viruses may be used to engineer recombinant viruses with little 
propensity for recombination with the Ad5 sequence in 293 cells. Similarly, an epithelial cell
line - e.g., the cell line known as 293 - may be used or further modified according to
within-disclosed methods which stably expresses adenovirus proteins or polypeptides from Ad3
and/or proteins or polypeptides from another non-group-C or group C serotype; such a cell 
line would be useful to support adenovirus-derived viral vectors bearing deletions of regulatory 
and/or structural genes, irrespective of the serotype from which such a vector was derived. 

It is also contemplated that the constructs and methods of the present invention will 
support the design and engineering of chimeric viral vectors which express amino acid residue 
sequences derived from two or more Ad serotypes. Thus, unlike methods and constructs 
available prior to the advent of the present disclosure, this invention allows the greatest 
possible flexibility in the design and preparation of useful viral vectors and cell lines which 
support their construction and propagation - all with a decreased risk of recombining with 
wild-type Ad to produce potentially-harmful recombinants. 

In part, the present invention discloses a simpler, alternative means of reducing the 
recombination between viral and cellular sequences than those discussed in the art. One such 
means is to increase the size of the deletion in the recombinant virus and thereby reduce the 
Polypeptide and Peptide: These terms are used interchangeably herein to designate a
series of no more than about 50 amino acid residues connected one to the other by peptide 
bonds between the alpha-amino and carboxy groups of adjacent residues. 

Receptor: Receptor is a term used herein to indicate a biologically active molecule that
specifically binds to (or with) other molecules. The term "receptor protein" may be used to 
more specifically indicate the proteinaceous nature of a specific receptor. 

Transgene or Therapeutic Nucleotide Sequence: As described and claimed herein, such
a sequence includes DNA and RNA sequences encoding an RNA or polypeptide. Such
sequences may be "native" or naturally-derived sequences; they may also be "non-native" or
"foreign" sequences which are naturally- or recombinantly-derived. The term "transgene,"
which may be used interchangeably herein with the term "therapeutic nucleotide sequence," is 
often used to describe a heterologous or foreign (exogenous) gene that is carried by a viral 
vector and transduced into a host cell. 

Therefore, therapeutic nucleotide sequences include antisense sequences or nucleotide 
sequences which may be transcribed into antisense sequences. Therapeutic nucleotide 
sequences (or transgenes) further comprise sequences which function to produce a desired 
effect in the cell or cell nucleus into which said therapeutic sequences are delivered. For 
example, a therapeutic nucleotide sequence may encode a functional protein intended for 
delivery into a cell which is unable to produce that functional protein. 

Expression or Delivery Vector: Any plasmid or virus into which a foreign DNA may
be inserted for expression in a suitable host cell -- i.e., the protein or polypeptide encoded by 
the DNA is synthesized in the host cell's system. Vectors capable of directing the expression 
of DNA segments (genes) encoding one or more proteins are referred to herein as "expression 
vectors". Also included are vectors which allow cloning of cDNA (complementary DNA) 
from mRNAs produced using reverse transcriptase. 

Adenoviral Vector or Ad-Derived Vector: Any adenovirus-derived plasmid or virus
into which a foreign DNA may be inserted or expressed. This term may also be used
interchangeably with "viral vector." This "type" of vector may be utilized to carry nucleotide
Fiber plays a crucial role in adenovirus infection by attaching the virus to a specific
receptor on the cell surface. The fiber consists of three domains: an N-terminal tail that
interacts with penton base; a shaft composed of 22 repeats of a 15-amino-acid segment that
The fiber is an elongated protein which exists as a trimer of three identical polypeptides
(polypeptide IV) of 582 amino acids in length. The N-terminus of the fiber mediates binding to 
the penton base to form what is generally called the penton capsomere. The C-terminus of the 
fiber is involved in initial binding of the virus to cellular receptors. 

The 35,000+ base pair (bp) genome of adenovirus type 2 has been sequenced and the 
predicted amino acid sequences of the major coat proteins (hexon, fiber and penton base) have 
been described. (See, e.g., Neumann et al., Gene 69: 153-157 (1988); Herisse et al., Nuc.
Acids Res. 9: 4023-4041 (1981); Roberts et al., J. Biol. Chem. 259: 13968-13975 (1984);
Kinloch et al., J. Biol. Chem. 259: 6431-6436 (1984); and Chroboczek et al., Virol. 161:
549-554 (1987).)

The sequence of Ad5 DNA was completed more recently; its sequence includes a total 
of 35,935 bp. Portions of many other adenovirus genomes have also been sequenced. It is 
presently understood that the upper packaging limit for adenovirus virions is about 105% of 
the wild-type genome length. (See, e.g., Bett, et al., J. Virol. 67(10): 5911-21 (1993).) Thus,
for Ad2 and Ad5, this would be an upper packaging limit of about 38 kb of DNA.
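
The arithmetic behind this packaging limit can be sketched as follows. This is only an illustration, assuming the 105% figure and the 35,935 bp Ad5 genome length quoted above; the function names and the 3,000 bp deletion size are hypothetical and chosen solely for the example.

```python
# Rough packaging-capacity arithmetic for an adenovirus-derived vector.
# Assumptions: the 105% packaging limit and the 35,935 bp Ad5 genome length
# come from the text; all names and the example deletion size are hypothetical.

AD5_GENOME_BP = 35_935          # Ad5 genome length quoted in the text
PACKAGING_LIMIT_PERCENT = 105   # upper limit as a percentage of wild-type length


def packaging_limit_bp(genome_bp: int, percent: int = PACKAGING_LIMIT_PERCENT) -> int:
    """Largest genome length (bp) expected to package, rounded down."""
    return genome_bp * percent // 100


def insert_capacity_bp(genome_bp: int, deleted_bp: int,
                       percent: int = PACKAGING_LIMIT_PERCENT) -> int:
    """Foreign DNA (bp) that fits after deleted_bp of viral sequence is removed."""
    if not 0 <= deleted_bp <= genome_bp:
        raise ValueError("deleted_bp must lie between 0 and genome_bp")
    remaining_viral_bp = genome_bp - deleted_bp
    return packaging_limit_bp(genome_bp, percent) - remaining_viral_bp


if __name__ == "__main__":
    print(packaging_limit_bp(AD5_GENOME_BP))                    # limit for Ad5
    print(insert_capacity_bp(AD5_GENOME_BP, deleted_bp=3_000))  # hypothetical deletion
```

Under these assumptions the Ad5 limit comes to roughly 37.7 kb, in line with the figure of about 38 kb, and every base pair deleted from the viral genome adds one base pair of capacity for foreign DNA.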

Adenovirus DNA also includes inverted terminal repeat sequences (ITRs) ranging in 
size from about 100 to 150 bp, depending on the serotype. The inverted repeats enable single 
strands of viral DNA to circularize by base-pairing of their terminal sequences, and the 
resulting base-paired "panhandle" structures are thought to be important for replication of the 
viral DNA. 
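
The base-pairing property described above can be illustrated with a small check: a single strand can form a "panhandle" when its 3' end is the reverse complement of its 5' end. The sketch below uses a made-up toy sequence, not a real adenovirus ITR, and the function names are hypothetical.

```python
# Toy check for inverted terminal repeats on a single DNA strand.
# Assumption: the sequence below is invented for illustration only.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_inverted_terminal_repeat(strand: str, itr_len: int) -> bool:
    """True if the last itr_len bases can base-pair with the first itr_len bases."""
    if itr_len <= 0 or 2 * itr_len > len(strand):
        return False
    strand = strand.upper()
    return strand[:itr_len] == reverse_complement(strand[-itr_len:])


if __name__ == "__main__":
    toy_itr = "ACGTTG"  # hypothetical 6-base terminal repeat
    toy_strand = toy_itr + "AAAA" + reverse_complement(toy_itr)
    print(has_inverted_terminal_repeat(toy_strand, len(toy_itr)))
```

Real ITRs are on the order of 100 to 150 bp, as noted above; the six-base repeat here only keeps the example readable.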

For efficient packaging, the ITRs and the packaging signal (a few hundred bp in length) 
appear to comprise the "minimum requirement." Helper-dependent vectors lacking all viral 
ORFs but including these essential cis elements (the ITRs and contiguous packaging sequence) 
have been constructed, but the virions package less efficiently than the helper and package as
multimers part of the time, which suggests that the virus may "want" to package a fuller DNA
complement (see, e.g., Fisher, et al., Virology 217: 11-22 (1996)).
C. Packaging Cell Lines

The first generation of recombinant adenoviral vectors currently available tend to have 
a deletion in the first viral early gene region which is generally referred to as E1, which
comprises the E1a and E1b regions. (These regions typically span genetic map units 1.30 to
9.24.) Figure 3 in chapter 67 of Fields Virology, 3d Ed. (Fields et al. (eds.), Lippincott-Raven
Publ., Philadelphia, (1996), p. 2116) illustrates a transcription and translation map of
adenovirus type 2 (Ad2) that is a helpful example.

According to various published reports, deletion of the viral E1 region renders the
recombinant adenovirus defective for replication and incapable of producing infectious viral
particles in the subsequently-infected target cells. Thus, the ability to generate E1-deleted
adenovirus is often based on the availability of the human embryonic kidney packaging cell line
called 293. This cell line contains the E1 region of adenovirus, which provides E1 gene region
products to "support" the growth of E1-deleted virus in the cell line (see, e.g., Graham et al., J.
Gen. Virol. 36: 59-71 (1977)).

Nevertheless, the inherent problems with current first-generation recombinant 
adenoviruses have raised increasing concerns about their use in patients. For example, several 
recent studies have shown that E1-deleted adenoviruses are not completely
replication-incompetent (see Rich, Hum. Gene Ther. 4: 461-476 (1993); Engelhardt, et al.,
Nature Genet. 4: 27-34 (1993)).

Three general limitations are associated with the adenoviral vector technology. First, 
infection both in vivo and in vitro with the adenoviral vector at high multiplicity of infection 
("MOI") has resulted in cytotoxicity to the target cells, due to the accumulation of penton 
protein, which is itself toxic to mammalian cells (Kay, Cell Biochem. 17E: 207 (1993)). 
Second, host immune responses against adenoviral late gene products, including penton 
protein, cause the inflammatory response and destruction of the infected tissue which received 
the vectors (Yang, et al., PNAS USA 92: 4407-4411 (1994)). Lastly, host immune responses
and cytotoxic effects together prevent the long-term expression of transgenes and cause 
decreased levels of gene expression following subsequent administration of adenoviral vectors 
(Mittal, et al., Virus Res. 28: 67-90 (1993)).
Thus, a recombinant DNA molecule (rDNA) of the present invention is a hybrid DNA
molecule comprising at least 2 nucleotide sequences not normally found together in nature. In 
various preferred embodiments, one of the sequences is a sequence encoding an Ad-derived 
polypeptide, protein, or fragment thereof. Stated another way, a therapeutic nucleotide 
sequence of the present invention is one that encodes an expressible protein, polypeptide or 
fragment thereof, and it may further include an active constitutive or regulatable (e.g.,
inducible) promoter sequence.

A therapeutic viral vector or composition of the present invention is optimally from 
about 20 base pairs to about 40,000 base pairs in length. Preferably the nucleic acid molecule 
is from about 50 bp to about 38,000 bp in length. In various embodiments, the nucleic acid 
molecule is of sufficient length to encode one or more adenovirus proteins or functional 
polypeptide portions thereof. Since individual Ad polypeptides vary in length from about 19 
amino acid residues to about 967 amino acid residues, corresponding nucleotide sequences will 
range from about 50 bp up to about 3000 bp, depending on the size of the individual
polypeptide-encoding sequences that are "replaced" in the viral vectors by therapeutic 
nucleotide sequences of the present invention. 
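
The correspondence between polypeptide length and coding-sequence length follows from the three-nucleotide codon. A minimal sketch, assuming only that relationship and the residue counts quoted above (the function name and the optional stop-codon flag are hypothetical):

```python
# Convert a polypeptide length (amino acid residues) into the length of the
# nucleotide sequence needed to encode it. Names here are illustrative only.

CODON_BP = 3  # nucleotides per codon


def coding_length_bp(residues: int, include_stop: bool = False) -> int:
    """Coding-sequence length in bp, optionally counting one stop codon."""
    if residues < 0:
        raise ValueError("residues must be non-negative")
    codons = residues + (1 if include_stop else 0)
    return codons * CODON_BP


if __name__ == "__main__":
    for residues in (19, 967):  # shortest and longest Ad polypeptides cited above
        print(residues, coding_length_bp(residues))
```

The two residue counts give 57 bp and 2,901 bp, which the text rounds to about 50 bp and about 3000 bp.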

Various Ad proteins are comprised of more than one polypeptide sequence. Deletion of the
corresponding genes from an Ad vector as taught herein will thus allow the
vector to accommodate even larger "foreign" DNA segments. Thus, if the sequences encoding
one or more adenovirus polypeptides or proteins are supplanted by a recombinant nucleotide 
sequence of the present invention, the length of the recombinant sequence can conceivably 
extend nearly to the packaging limit of the relevant adenovirus-derived vector. 

In view of the fact that preferred embodiments disclosed herein are helper-independent 
Ad-derived vectors, the entire wild-type Ad genome cannot be completely supplanted by 
recombinant nucleic acid molecules without transforming such a vector into a vector requiring 
"help" of some kind. However, the Ad-derived vectors of the present invention do not depend 
on a helper virus; instead, the vectors of the present invention are propagated in cell lines 
stably expressing proteins or polypeptides that have been removed from said vectors to allow 
the addition of "foreign" DNA into the vectors. In various disclosed embodiments, specific 
early region and structural polypeptides are deleted from the vectors of the present invention, 
thereby enabling the vectors to accommodate recombinant nucleic acid sequences (or 
As noted elsewhere herein, an Ad-derived vector of the present invention may also
include a promoter sequence. Both constitutive and regulatable (often called "inducible") 
promoters are useful in constructs and methods of the present invention. For example, some 
useful regulatable promoters are those of the CREB-regulated gene family and include - and - 
inhibin, - gonadotropin, cytochrome c, glucagon, and the like. (See, e.g., published 
International App. No. WO96/14061, the disclosures of which are incorporated by reference 

herein.) 

A regulatable or inducible promoter may be described as a promoter wherein the rate of 
RNA polymerase binding and initiation is modulated by external stimuli. Such stimuli include 
various compounds or compositions, light, heat, stress, chemical energy sources, and the like. 
Inducible, suppressive and repressive promoters are considered regulatable promoters. 

Regulatable promoters may also include tissue-specific promoters. Tissue-specific 
promoters direct the expression of the gene to which they are operably linked to a specific cell 
type. A tissue-specific promoter causes the gene located 3' of it to be expressed predominantly, 
if not exclusively, in the specific cells where the promoter expresses its endogenous gene. 
Typically, it appears that if a tissue-specific promoter expresses the gene located 3' of it at all, 
then it is expressed appropriately in the correct cell types (see, e.g., Palmiter et al., Ann. Rev. 
Genet. 20: 465-499 (1986)). 

When a tissue-specific promoter controls the expression of a gene, that gene will be 
expressed in a small number of tissues or cell types rather than in substantially all tissues and 
cell types. Examples of tissue-specific promoters include the immunoglobulin promoter 
described by Brinster et al., Nature 306: 332-336 (1983) and Storb et al., Nature 310: 238-241 
(1984); the elastase-I promoter described by Swift et al., Cell 38: 639-646 (1984); the globin 
promoter described by Townes et al., Mol. Cell. Biol. 5: 1977-1983 (1985), and Magram et 
al., Mol. Cell. Biol. 9: 4581-4584 (1989); the insulin promoter described by Bucchini et al., 
PNAS USA 83: 2511-2515 (1986) and Edwards et al., Cell 58: 161 (1989); the 
immunoglobulin promoter described by Rusconi et al., Nature 314: 330-334 (1985) and 
Grosschedl et al., Cell 38: 647-658 (1984); the alpha actin promoter described by Shani, Mol. 
Cell. Biol. 6: 2624-2631 (1986); the alpha crystallin promoter described by Overbeek et al., 
PNAS USA 82: 7815-7819 (1985); the prolactin promoter described by Crenshaw et al., 
Genes and Development 3: 959-972 (1989); the proopiomelanocortin promoter described by 

nucleotide sequence. As described herein, a number of adenovirus-derived moieties are useful 
in the presently-disclosed therapeutic compositions and methods. 

While some of the Examples appearing below specifically recite fiber proteins, 
polypeptides, and fragments thereof, it is expressly provided herein that other structural and 
non-structural Ad proteins and polypeptides (e.g., regulatory proteins and polypeptides) may 
be used as components of the various disclosed vectors and cell lines. Moreover, chimeric 
molecules comprised of proteins, polypeptides, and/or fragments thereof which are derived 
from different Ad serotypes may be used in any of the within-disclosed methods, constructs 
and compositions. Similarly, recombinant DNA sequences of the present invention may be 
prepared using nucleic acid sequences derived from different Ad serotypes, in order to design 
useful constructs with broad applicability, as disclosed and claimed herein. 

It should also be appreciated that, while the members of Group C adenovirus - i.e., Ad 
serotypes 1, 2, 5, and 6 - are specifically recited in various examples herein, the present 
invention is in no way limited to those serotypes alone. In view of the fact that the adenovirus 
serotypes are all closely-related in structure and functionality, therapeutic viral vectors, 
packaging cell lines, and plasmids of the present invention may be constructed from 
components of any and all Ad serotypes - and the within-disclosed methods of making and 
using the various constructs and cell lines of the present invention apply to all of said 
serotypes. 

The preparation of a pharmacological composition that contains active ingredients 
dissolved or dispersed therein is well understood in the art. Typically such compositions are 
prepared as injectables - either as liquid solutions or suspensions - however, solid forms 
suitable for solution or suspension in liquid prior to use can also be prepared. A preparation 
can also be emulsified, or formulated into suppositories, ointments, creams, dermal patches, or 
the like, depending on the desired route of administration. 

The active ingredient can be mixed with excipients which are pharmaceutically 
acceptable and compatible with the active ingredient and in amounts suitable for use in the 
therapeutic methods described herein. Suitable excipients are, for example, water, saline, 
dextrose, glycerol, ethanol or the like and combinations thereof, including vegetable oils, 
propylene glycol, polyethylene glycol and benzyl alcohol (for injection or liquid preparations); 
and petrolatum (e.g., VASELINE), vegetable oil, animal fat and polyethylene glycol (for 

(1979); U.S. Patent No. 4,356,270; and Brown et al., Meth. Enzymol., 68:109, (1979), the 
disclosures of which are incorporated by reference herein. 

For therapeutic oligonucleotide sequence compositions in which a family of variants is 
preferred, the family members can be synthesized simultaneously in a single 
reaction vessel, or can be synthesized independently and later admixed in preselected molar 
ratios. For simultaneous synthesis, the nucleotide residues that are conserved at preselected 
positions of the sequence of the family member can be introduced in a chemical synthesis 
protocol simultaneously to the variants by the addition of a single preselected nucleotide 
precursor to the solid phase oligonucleotide reaction admixture when that position number of 
the oligonucleotide is being chemically added to the growing oligonucleotide polymer. The 
nucleotide residues at those positions in the sequence that vary can be introduced 
simultaneously by the addition of amounts, preferably equimolar amounts, of multiple 
preselected nucleotide precursors to the solid phase oligonucleotide reaction admixture during 
chemical synthesis. For example, where all four possible natural nucleotides (A, T, G and C) are 
to be added at a preselected position, their precursors are added to the oligonucleotide 
synthesis reaction at that step to simultaneously form four variants. 
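
The family of variants produced by such a simultaneous synthesis can be enumerated in a short sketch. This is an illustration only and not part of the original disclosure: the pattern "GANTC" is a hypothetical sequence, the symbol N (any of A, C, G or T) follows the common ambiguity-code convention, and the function name `expand_family` is an assumption introduced here.

```python
# Illustrative sketch only: list the family members implied by a degenerate pattern.
# Assumption: "N" marks a position where all four natural nucleotides are added.
from itertools import product
from typing import List

DEGENERATE_CODES = {
    "A": "A",
    "C": "C",
    "G": "G",
    "T": "T",
    "N": "ACGT",  # variable position: equimolar mix of all four precursors
}


def expand_family(pattern: str) -> List[str]:
    """Return every oligonucleotide encoded by a pattern of A, C, G, T and N."""
    choices = [DEGENERATE_CODES[base] for base in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical 5-mer with one variable position: four variants are expected.
family = expand_family("GANTC")
print(family)
```

One variable position yields four family members, and each additional variable position multiplies the family size by four.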

This manner of simultaneous synthesis of a family of related oligonucleotides has been 
previously described for the preparation of "Degenerate Oligonucleotides" by Ausubel et al. 
(Current Protocols in Molecular Biology, Suppl. 8, p. 2.11.7, John Wiley & Sons, Inc., New 
York (1991)), and can readily be applied to the preparation of the therapeutic oligonucleotide 
compositions described herein. 

Nucleotide bases other than the common four nucleotides (A,T,G or C), or the RNA 
equivalent nucleotide uracil (U), can also be used in the present invention. For example, it is 
well known that inosine (I) is capable of hybridizing with A, T and G, but not C. Examples of 
other useful nucleotide analogs are known in the art; many may be found listed in 37 C.F.R. 
§1.822. 

Thus, where all four common nucleotides are to occupy a single position of a family of 
oligonucleotides, that is, where the preselected therapeutic nucleotide composition is designed 
to contain oligonucleotides that can hybridize to four sequences that vary at one position, 
several different oligonucleotide structures are contemplated. The composition can contain 
four members, where a preselected position contains A,T,G or C. Alternatively, the 

"operatively linked" generally means the sequences or segments have been covalently joined 
into one piece of DNA, whether in single or double stranded form. 

The choice of viral vector into which a therapeutic nucleotide sequence of this 
invention is operatively linked depends directly, as is well known in the art, on the functional 
properties desired, e.g., vector replication and protein expression, and the host cell to be 
transformed - these being limitations inherent in the art of constructing recombinant DNA 
molecules. Although certain adenovirus serotypes are recited herein in the form of specific 
examples, it should be understood that the present invention contemplates the use of any 
adenovirus serotype, including hybrids and derivatives thereof. As one will observe, it is not 
unusual or outside the scope of the present invention to utilize nucleotide and/or amino acid 
residue sequences of two or more serotypes in constructs, compositions and methods of the 
invention. 

As one of skill in the art will note, in various embodiments of the present invention, 
different "types" of vectors are disclosed. For example, one "type" of vector is used to deliver 
particular nucleotide sequences into a packaging cell line, with the intent of having said 
sequences stably integrate into the cellular genome; these "types" of vectors are generally 
identified herein as complementing plasmids. A further "type" of vector described herein 
carries or delivers nucleotide sequences in or into a cell line (e.g., a packaging cell line) for the 
purpose of propagating therapeutic viral vectors of the present invention; hence, these vectors 
are generally referred to herein as delivery plasmids. A third "type" of vector described herein 
is utilized to carry nucleotide sequences encoding therapeutic proteins or polypeptides to 
specific cells or cell types in a subject in need of treatment; these vectors are generally 
identified herein as therapeutic viral vectors or Ad-derived vectors. 

In one embodiment, the directional ligation means is provided by nucleotides present in 
the upstream nucleotide sequence, downstream nucleotide sequence, or both. In another 
embodiment, the sequence of nucleotides adapted for directional ligation comprises a sequence 
of nucleotides that defines multiple directional cloning means. Where the sequence of 
nucleotides adapted for directional ligation defines numerous restriction sites, it is referred to 
as a multiple cloning site. 

A translatable nucleotide sequence is a linear series of nucleotides that provides an 
uninterrupted series of at least 8 codons that encode a polypeptide in one reading frame. 

E. Therapeutic Methods 

The vectors of the present invention are particularly suited for gene therapy. Thus, 
various therapeutic methods are contemplated by the present invention. 

For example, it has now been discovered that Ad-derived viral vectors are capable of 
delivering a therapeutic nucleotide sequence to a specific cell or tissue, thereby expanding and 
enhancing treatment options available in numerous conditions in which more conventional 
therapies are of limited efficacy. Accordingly, methods of gene therapy utilizing these vectors 
are within the scope of the invention. Vectors are typically purified and then an effective 
amount is administered in vivo or ex vivo into the subject. 

For example, the compositions may be used prophylactically or therapeutically in vivo 
to disrupt HIV infection and mechanisms of action by inhibiting gene expression or activation, 
via delivery of antisense HIV sequences or ribozymes to T cells or monocytes. Using methods 
of the present invention, one may target therapeutic viral vectors as disclosed herein to specific 
cells and tissues, including hematopoietic cells, as infection of such cells appears to be 
mediated by distinct integrins to which viral vectors of the present invention may readily be 
targeted. (See, e.g., Huang, et al., J. Virol. 70 : 4502-8 (1996).) 

Other useful therapeutic nucleotide sequences include antisense nucleotide sequences 
complementary to the EBV EBNA-1 gene. Use of such therapeutic sequences may remediate or 
prevent latent infection of B cells with EBV. As discussed herein and in the Examples below, 
targeting and delivery may be accomplished via the use of various ligands, receptors, and other 
appropriate targeting agents. 

Thus, in one embodiment, a therapeutic method of the present invention comprises 
contacting the cells of a subject infected with EBV or HIV with a therapeutically effective 
amount of a pharmaceutically acceptable composition comprising a therapeutic nucleotide 
sequence of this invention. In a related embodiment, the contacting involves introducing the 
therapeutic nucleotide sequence composition into cells having an EBV- or HIV-mediated 
infection. 

Methods of gene therapy are well known in the art (see, e.g., Larrick and Burck, Gene 
Therapy: Application of Molecular Biology . Elsevier Science Publ. Co., Inc., New York, NY 
(1991); Kriegler, Gene Transfer and Expression: A Laboratory Manual, W. H. Freeman and 
Company, New York (1990)). The term "subject" should be understood to include any animal 

The dose of a biologic vector is somewhat complex and may be described in terms of 
the concentration (in plaque-forming units per milliliter (pfu/ml)), the total dose (in pfus), and 
the estimated number of vectors administered per cell (the estimated multiplicity of infection or 
MOI). Thus, if a vector is administered via infusion - say, across nasal epithelium - at a 
constant total volume, the respective concentration, etc. may be described as follows: 



Concentration   Volume   Dose         Estimated 
(pfu/ml)        (ml)     (pfu)        MOI 

10^7            2        2 x 10^7     1 
10^8            2        2 x 10^8     10 
10^9            2        2 x 10^9     100 
10^10           2        2 x 10^10    1000 



In general, when adenoviral vectors are administered via infusion across the nasal 
epithelium, administered amounts producing an estimated MOI of about 10 or greater are 
much more effective than lower dosages. (See, e.g., Knowles, et al., New Eng. J. Med. 333: 
823-831 (1995).) Similarly, when direct injection is the preferred treatment modality - e.g., 
direct injection of a viral vector into a tumor - doses of 1 x 10^9 pfu or greater are generally 
preferred. (See, e.g., published International App. No. WO 95/11984.) 
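
The relationship between concentration, volume, total dose and estimated MOI tabulated above can be sketched as follows. This is an illustration only and not part of the original disclosure. The target-cell count of 2 x 10^7 is an assumption back-calculated from the table (a dose of 2 x 10^7 pfu corresponds to an estimated MOI of 1); it is not stated in the text, and the function names are hypothetical.

```python
# Illustrative sketch only: reproduce the dose table from concentration and volume.
# Assumption (not stated in the text): about 2e7 target cells are exposed, which is
# the value implied by the first table row (2e7 pfu giving an estimated MOI of 1).
ASSUMED_TARGET_CELLS = 2e7


def total_dose_pfu(concentration_pfu_per_ml: float, volume_ml: float) -> float:
    """Total dose in plaque-forming units (pfu)."""
    return concentration_pfu_per_ml * volume_ml


def estimated_moi(dose_pfu: float, target_cells: float = ASSUMED_TARGET_CELLS) -> float:
    """Estimated multiplicity of infection: vectors administered per target cell."""
    if target_cells <= 0:
        raise ValueError("target cell count must be positive")
    return dose_pfu / target_cells


for exponent in (7, 8, 9, 10):
    concentration = 10.0 ** exponent
    dose = total_dose_pfu(concentration, 2.0)
    print(f"{concentration:.0e} pfu/ml x 2 ml = {dose:.0e} pfu, "
          f"estimated MOI about {estimated_moi(dose):.0f}")
```

Under this assumed cell count, the threshold of "an estimated MOI of about 10 or greater" corresponds to the second table row and above.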

Thus, depending on the mode of administration, an effective amount administered in a 
single dose preferably contains from about 10^6 to about 10^15 infectious units. A typical 
course of treatment would be one such dose per day over a period of five days. As those of 
skill in the art will appreciate, an effective amount may vary depending on (1) the pathology or 
other condition to be treated, (2) the status and sensitivity of the patient, and (3) various other 
factors well known to those of skill in the art, such as the patient's tolerance to other courses 
of treatment that may have been applied previously. Thus, those of skill in the art may easily 
and precisely determine effective amounts of the agents/vectors of the present invention which 
may be administered to a particular patient, based on their understanding of and evaluation of 
such factors. 

cellular growth in a patient, thereby alleviating the symptoms of the disease or cachexia present 
in the patient. The effect of this treatment includes, but is not limited to, prolonged survival 
time of the patient, reduction in tumor mass or burden, apoptosis of tumor cells, or the 
reduction in the number of circulating tumor cells. Means of quantifying the beneficial effects 
of this therapy are well known to those of skill in the art. 

The present invention provides a recombinant adenovirus expression vector 
characterized by the partial or total deletion of one or more adenoviral structural protein genes, 
such as the gene encoding fiber, which allows the vector to accommodate a therapeutic, 
foreign nucleic acid sequence encoding a functional foreign polypeptide, protein, or 
biologically active fragment thereof. For example, such a functional polypeptide moiety may 
be a suicide gene or a functional equivalent thereof, of which the anti-cancer gene TK is but 
one example. TK genes, when expressed, produce a gene product which is lethal to the cell, 
particularly in the presence of GCV. One source of the TK gene is the herpes simplex virus 
(HSV), albeit other sources are known as well and may be used as taught herein. The TK gene 
may readily be obtained from HSV by methods well known to those of skill in the art. For 
example, the plasmid pMLBKTK in E. coli HB101 (from ATCC #39369) is a source of the 
HSV-1 TK gene, which may be used as disclosed herein. (See, e.g., published International 
Application No. WO 95/11984, the disclosures of which are incorporated by reference herein.) 

A therapeutic gene sequence may be introduced into a tumor mass by combining the 
adenoviral expression vector with a suitable pharmaceutically acceptable carrier. Introduction 
can be accomplished, for example, via direct injection of the recombinant Ad vector into the 
tumor mass. In the specific case of a cancer such as hepatocellular carcinoma (HCC), direct 
injection into the hepatic artery can be used for delivery, because most HCCs derive their 
circulation from this artery. Similar techniques of administration may be applied to other 
specific types of tumors and malignancies, as is known to those of skill in the art. 

A method of tumor-specific delivery of a tumor-suppressor gene is accomplished by 
contacting target tissue in a subject with an effective amount of a recombinant Ad-derived 
vector of this invention. In the case of anti-tumor therapy, the gene is intended to encode an 
anti-tumor agent, such as a functional tumor suppressor gene product or suicide gene product. 
The term "contacting" is intended to encompass any delivery method for the efficient transfer 
of the vector, such as via intra-tumoral injection. 

example, aerosol administration and administration via subcutaneous, intravenous, 
intraperitoneal, intramuscular, ocular means and the like are also within the scope of the 
present invention. 

Other gene-delivery methods are also useful in conjunction with the methods, 
compositions and constructs of the present invention; see, e.g., published International 
Application No. WO 95/11984, the disclosures of which are incorporated by reference herein. 

Similarly, various non-human animals having inserted therein the vectors or 
transformed cells of this invention are contemplated. These "transgenic" animals are made using methods well 
known to those of skill in the art. For example, see U.S. Patent No. 5,175,384 (the disclosures 
of which are incorporated by reference herein). 

The present invention also contemplates various methods of targeting specific cells - 
e.g., cells in a subject in need of diagnosis and/or treatment. As discussed herein, the present 
invention contemplates that the viral vectors and compositions of the present invention may be 
directed to specific receptors or cells, for the ultimate purpose of delivering those vectors and 
compositions to specific cells or cell types. The viral vectors and constructs of the present 
invention are particularly useful in this regard. 

In general, adenovirus attachment and uptake into cells are separate but cooperative 
events that result from the interaction of distinct viral coat proteins with a receptor for 
attachment and αv integrin receptors for internalization. Adenovirus attachment to the cell 
surface via the fiber coat proteins has been discovered to be dissociable and distinct from the 
subsequent step of internalization, and the present invention is able to take advantage of and 
function in conjunction with these differing receptors. 

G. Other Applications 

The cell lines, viral vectors and methods of the present invention may also be used for 
purposes other than the direct administration of therapeutic nucleotide sequences. In one such 
application, the production of large quantities of biologically active proteins or polypeptides in 
cells transfected with the within-disclosed viral vectors is contemplated herein. For example, 
human lymphoblastoid cells may be transfected with an integrative viral vector of the present 
invention carrying a human hematopoietic growth factor such as the gene for erythropoietin 
(EPO); cells so transfected are thus able to produce biologically active EPO. (See, e.g., Lopez 
et al., Gene 148 : 285-91 (1994).) 

[...] the preparation of adenovirus packaging [...] embryonic [...] cell line (ATCC 
Accession Number CRL 1573); HeLa, a [...] (ATCC Accession Number CCL 2); A549, a human 
lung carcinoma cell line (ATCC Accession Number [...]); and the like epithelial-derived cell 
lines. As a result of the adenovirus transformation, the 293 cells contain the E1 early region 
regulatory gene. All cells were [...] + 10% fetal calf serum unless otherwise noted. 

[...] adenovirus-based gene delivery vectors having deletions in preselected gene regions by 
cellular complementation of adenoviral genes. To [...] adenoviral genomes in order to generate 
a novel viral vector [...] functional units were designed as described [...] 



WO 98/13499    PCT/EP97/05251 

A. [...] Adenoviruses 

The viral E4 regulatory region contains a single transcription unit which is alternately 
spliced to produce several different mRNAs. The E4-expressing plasmid prepared as described 
herein and used to transfect the 293 cell line contains the entire E4 transcriptional unit as 
shown in Figure 1. A DNA fragment extending from 175 nucleotides upstream of the E4 
transcriptional start site including the natural E4 promoter to 153 nucleotides downstream of 
the E4 polyadenylation signal including the natural E4 terminator signal, corresponding to 
nucleotides 32667-35780 of the adenovirus type 5 (hereinafter referred to as Ad5) genome as 
described in Chroboczek et al. (Virol., 186:280-285 (1992), GenBank Accession Number 
M73260), was amplified from Ad5 genomic DNA, obtained from the ATCC, via the 
polymerase chain reaction (PCR). Sequences of the primers used were 
5'CGGTACACAGAATTCAGGAGACACAACTCC3' (forward or 5' primer referred to as 
E4L) (SEQ ID NO 1) and 5'GCCTGGATCCGGGAAGTTACGTAACGTGGGAAAAC3' 
(SEQ ID NO 2) (backward or 3' primer referred to as E4R). To facilitate cloning of the PCR 
fragment, these oligonucleotides were designed to create novel sites for the restriction enzymes 
EcoRI and BamHI, respectively, as indicated with underlined nucleotides. DNA was amplified 
via PCR using 30 cycles of 92°C for 1 minute, 50°C for 1 minute, and 72°C for 3 minutes 
resulting in amplified full-length E4 gene products. 
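
The figures in the preceding paragraph can be cross-checked with a short sketch. This is an illustration only and not part of the original disclosure. The E4L sequence is taken from the text with the spaces introduced by text extraction removed; the function names are hypothetical; the cycling-time estimate ignores ramp and hold times, which the text does not give; and the span computed is the genomic interval only, without any extra bases contributed by the primer tails.

```python
# Illustrative sketch only: cross-check the E4 amplification figures given in the text.
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Forward primer E4L (SEQ ID NO 1) as printed in the text, extraction spaces removed.
E4L = "CGGTACACAGAATTCAGGAGACACAACTCC"


def site_positions(sequence: str, site: str) -> list:
    """Return 0-based start indices of every occurrence of `site` in `sequence`."""
    hits = []
    start = sequence.find(site)
    while start != -1:
        hits.append(start)
        start = sequence.find(site, start + 1)
    return hits


def span_length(first: int, last: int) -> int:
    """Length of an inclusive coordinate range such as 32667-35780."""
    return last - first + 1


def program_minutes(cycles: int, step_minutes: tuple) -> int:
    """Total programmed step time, ignoring ramping (an assumption of this sketch)."""
    return cycles * sum(step_minutes)


print("EcoRI site start(s) in E4L:", site_positions(E4L, ECORI_SITE))
print("Ad5 span 32667-35780 (bp):", span_length(32667, 35780))
print("Programmed time (min):", program_minutes(30, (1, 1, 3)))
```

The same `site_positions` helper could be applied to the backward primer to confirm its BamHI site (GGATCC), provided the primer sequence is read reliably from the original document.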

The amplified DNA E4 products were then digested with EcoRI and BamHI for 
cloning into the compatible sites of pBluescript/SK+ by standard techniques to create the 
plasmid pBS/E4. A 2603 base pair (bp) cassette including the herpes simplex virus thymidine 
kinase promoter, the hygromycin resistance gene, and the thymidine kinase polyadenylation 
signal was excised from the plasmid pMEP4 (Invitrogen, San Diego, CA) by digestion with 
FspI followed by addition of BamHI linkers (5'CGCGGATCCGCG3') (SEQ ID NO 3) for 
subsequent digestion with BamHI to isolate the hygromycin-containing fragment. 

The isolated BamHI-modified fragment was then cloned into the BamHI site of pBS/E4 
containing the E4 region to create the plasmid pE4/Hygro containing 8710 bp (Figure 2). The 
pE4/Hygro plasmid has been deposited with the ATCC as described in Example 3. The 
complete nucleotide sequence of pE4/Hygro is listed in SEQ ID NO 4. Position number 1 of 
the linearized vector corresponds to approximately the middle portion of the pBS/SK+ 
backbone as shown in Figure 2 as a thin line between the 3' BamHI site in the hygromycin 
insert and the 3' EcoRI site in the E4 [...]. The [...] respective nucleotide positions 3820 
and 707 of SEQ ID NO [...] neomycin insert are located at respective nucleotide positions 
3830 and 6470. [...] selected for use, the E4 and hygromycin resistance genes were 
divergently [...] 

[...] fiber-encoding construct, primers were designed to amplify [...] coding region from 
Ad5 genomic DNA with the addition of unique BamHI and NotI sites at the 5' and 3' ends of 
the fragment, respectively. The Ad5 [...] GenBank Accession Number [...]. The 5' and 3' 
primers had the respective [...] sequences of 5'ATGGGATCCAAGATGAAGCGCGCAAGACCG3' 
(SEQ ID NO 5) and 5'[...]ATAACGCGGCCGCTTCTTTATTCTTGGGC3' (SEQ ID NO [...]) [...] 
hormone (BGH) [...] and the gene encoding neomycin [...] genome. 

The complete nucleotide sequence of pCDNA3/Fiber is listed in SEQ ID NO 7 where 
the nucleotide position 1 corresponds to approximately the middle of the pcDNA 3 [...]. 
The 5' and 3' ends of the fiber gene are located at respective nucleotide positions 
916 with ATG and 2661 with TAA. 

To enhance expression of fiber protein by the [...] CMV promoter provided by 
the pcDNA vector, a BglII fragment containing the tripartite leader (TPL) of adenovirus type [...] 
was excised from pRD112a (Sheay et al., [...]:856-862 (1993)) and inserted into 
the BamHI site of pCDNA3/Fiber to create the plasmid pCLF having 7469 bp, the plasmid 
map of which is shown in Figure 4. The adenovirus tripartite leader sequence, present at the 5' 
end of all major late adenoviral mRNAs as described by Logan et al., Proc. Natl. Acad. Sci., 
USA, 81:3655-3659 (1984) and Berkner, BioTechniques, 6:616-629 (1988), is encoded by 
three spatially separated exons corresponding to nucleotide positions 6071-6079 (the 3' end of 
the first leader segment), 7101-7172 (the entire second leader segment), and 9634-9721 (the 
third leader segment) in the adenovirus type 2 genome. The tripartite sequence, however, also 
shows correspondence with the Ad5 leader sequence having three spatially separated exons 
corresponding to nucleotide positions 6081-6089 (the 3' end of the first leader segment), 
7111-7182 (the entire second leader segment), and 9644-9845 (the third leader segment and 
sequence downstream of that segment). The corresponding cDNA sequence of the tripartite 
leader sequence present in pCLF is listed in SEQ ID NO 8 bordered by BamHI/BglII 5' and 3' 
sites at respective nucleotide positions 907-912 to 1228-1233. 

The pCLF plasmid has been deposited with the ATCC as described in Example 3. 
The complete nucleotide sequence of pCLF is listed in SEQ ID NO 8 where the nucleotide 
position 1 corresponds to approximately the middle of the pcDNA 3 parent vector sequence. 
The 5' and 3' ends of the Ad5 fiber gene are located at respective nucleotide positions 
1237-1239 with ATG and 2980-2982 with TAA. The rest of the vector construct has been 
described above. 
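
The coordinates given above for the tripartite leader segments and for the fiber open reading frame in pCLF can be checked arithmetically. The sketch below is an illustration only and not part of the original disclosure; the variable and function names are hypothetical, and the exon lengths are simply those implied by the listed Ad2 coordinates (the first range covers only the 3' end of the first leader segment).

```python
# Illustrative sketch only: arithmetic on the inclusive coordinate ranges in the text.

# Ad2 tripartite leader ranges as listed (the first covers only the 3' end of leader 1).
AD2_TPL_RANGES = [(6071, 6079), (7101, 7172), (9634, 9721)]

# Ad5 fiber gene in pCLF: first base of the ATG through last base of the TAA.
FIBER_ORF_IN_PCLF = (1237, 2982)


def segment_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


exon_lengths = [segment_length(start, end) for start, end in AD2_TPL_RANGES]
orf_length = segment_length(*FIBER_ORF_IN_PCLF)
codons, remainder = divmod(orf_length, 3)

print("Listed leader range lengths (nt):", exon_lengths)
print("Fiber ORF length (nt):", orf_length)
print("Codons including stop:", codons, "remainder:", remainder)
```

A remainder of zero indicates that the stated ATG and TAA positions are in the same reading frame.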

C. Generation of an Adenovirus Packaging Cell Line Carrying Plasmids Encoding 
Functional E4 and Fiber Proteins 

The 293 cell line was selected for preparing the first adenovirus packaging line as it 
already contains the E1 gene as prepared by Graham et al., J. Gen. Virol., 36:59-74 (1977) and 
as further characterized by Spector, Virol., 130:533-538 (1983). Before electroporation, 293 
cells were grown in RPMI medium + 10% fetal calf serum. Four x 10^6 cells were 
electroporated with 20 ug each of pE4/Hygro DNA and pCLF DNA using a BioRad 
GenePulser and settings of 300 V, 25 uF. DNA for electroporation was prepared using the 
Qiagen system according to the manufacturer's instructions (Bio-Rad, Richmond, CA). 

Following electroporation, cells were split into fresh complete DMEM + 10% fetal calf 
serum containing 200 ug/ml Hygromycin B (Sigma, St. Louis, MO). 

From expanded colonies, genomic DNA was isolated using the "MICROTURBOGEN" 
system (Invitrogen) according to manufacturer's instructions. [...] DNA was assessed by 
PCR using [...] (5'TGCTTAAGCGGCCGCGAAGGAGAAGTCC3') (SEQ ID NO 9), the latter of 
which is a forward primer near adenovirus 5 open reading frame 6. Refer to Figure [...] 
relative to the [...] that seen in parent cell line 293. 

Genomic Southern blotting was performed on DNA [...] E4 [...] the probe hybridized to a 
[...] fragment, which may be the result [...] 

Although the 211 cell line was not selected by neomycin [...], the 211 cell line was analyzed 
for expression of [...] protein by [...] antibody and a FITC-labeled anti-rabbit IgG [...]. 
No [...] was detected. Therefore, to generate 211 clones containing recombinant fiber genes, 
[...] was expanded by growing in RPMI medium and subjected [...] electroporation with [...] 
fiber-encoding pCLF plasmid as described above. 

Following electroporation, cells were plated in DMEM + 10% [...] colonies were selected 
with 200 ug/ml G418 (Gibco, Gaithersburg, MD). Positive [...] remained hygromycin 
resistant. These candidate sublines of 211 were then screened for fiber protein expression by 
indirect immunofluorescence as described above. The three [...] screened, 211A, 211B and 
211R, along with [...] 
staining qualitatively comparable to the positive control of 293 cells infected with AdRSVβgal 
(1 pfu/cell) and stained 24 hours post-infection. 

Lines positive for nuclear staining in this assay were then subjected to Western blot 
analysis under denaturing conditions using the same antibody. Several lines in which the 
antibody detected a protein of the expected molecular weight (62 kd for the Ad5 fiber protein) 
were selected for further study including 211A, 211B and 211R. The 211A cell line has been 
deposited with ATCC as described in Example 3. 

Western blot analysis using soluble nuclear extracts from these three cell lines and a 
seminative electrophoresis system demonstrated that the fiber protein expressed is in the 
functional trimeric form characteristic of the native fiber protein as shown in Figure 6. The 
predicted molecular weight of a trimerized fiber is 186 kd. The lane marked 293 lacks fiber 
while the sublines contain detectable fiber. Under denaturing conditions, the trimeric form was 
destroyed resulting in detectable fiber monomers as shown in Figure 6. Those clones 
containing endogenous E1, newly expressed recombinant E4 and fiber proteins were selected 
for use in complementing adenovirus gene delivery vectors having the corresponding 
adenoviral genes deleted as described in Example 2. 

D . Pre paration of an El-Expressing Plasmid for Comp le menta tion of El -Gene-Deleted 
Adenoviruses 

In order to prepare adenoviral packaging cell lines other than those based on the El - 
gene containing 293 cell line as described in Example 1C above, plasmid vectors containing El 
alone or in various combinations with E4 and fiber genes are constructed as described below. 

The region of the adenovirus genome containing the E1a and E1b genes is amplified 
from viral genomic DNA by PCR as previously described. The primers used are E1L, the 5' or 
forward primer, and E1R, the 3' or backward primer, having the respective nucleotide 
sequences 5'CCGAGCTAGCGACTGAAAATGAG3' (SEQ ID NO 10) and 
5'CCTCTCGAGAGACAGCAAGACAC3' (SEQ ID NO 11). The E1L and E1R primers 
include the respective restriction sites NheI and XhoI as indicated by the underlines. The sites 
are used to clone the amplified E1 gene fragment into the NheI/XhoI sites in pMAM, 
commercially available from Clontech (Palo Alto, CA), to form the plasmid pDEX/E1 having 
11152 bp, the plasmid map of which is shown in Figure 7. 
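
Because the underlines marking the restriction sites are not reproduced in plain text, the sites can be located by a simple string search. The sketch below is illustrative only: it assumes the standard recognition sequences GCTAGC (NheI) and CTCGAG (XhoI), and the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NheI": "GCTAGC", "XhoI": "CTCGAG"}

# Primer sequences as given for SEQ ID NO 10 and SEQ ID NO 11.
PRIMERS = {
    "E1L": "CCGAGCTAGCGACTGAAAATGAG",
    "E1R": "CCTCTCGAGAGACAGCAAGACAC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = [
            i
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), find_sites(seq))
```

Each primer carries exactly one of the two sites near its 5' end, consistent with directional cloning into the NheI/XhoI sites of pMAM.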
proteins in various combinations are also prepared as described below. The resultant plasmids 
are then used in various cell systems with delivery plasmids having the corresponding 
adenoviral gene deletions. The selection of packaging cell, content of the delivery plasmids 
and content of the complementing plasmids for use in generating recombinant adenovirus viral 
vectors of this invention thus depends on whether other adenoviral genes are deleted along 
with the adenoviral fiber gene, and, if so, which ones. 

1. Preparation of a Complementing Plasmid Containing Fiber and E1 Adenoviral Genes 

A DNA fragment containing sequences for the CMV promoter, adenovirus tripartite 
leader, fiber gene and bovine growth hormone terminator is amplified from pCLF prepared in 
Example 1B using the forward primer 5'GACGGATCGGGAGATCTCC3' (SEQ ID NO 13), 
that anneals to nucleotides 1-19 of the pCDNA3 vector backbone in pCLF, and the 
backward primer 5'CCGCCTCAGAAGCCATAGAGCC3' (SEQ ID NO 14) that anneals to 
nucleotides 1278-1257 of the pCDNA3 vector backbone. The fragment is amplified as 
previously described and then cloned into the pDEX/E1 plasmid, prepared in Example 1D. 
For cloning in the DNA fragment, the pDEX/E1 vector is first digested with NdeI, which cuts at 
a unique site in the pMAM vector backbone in pDEX/E1, then the ends are repaired by 
treatment with bacteriophage T4 polymerase and dNTPs. 
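
The primer sequences can be checked against the annealing coordinates quoted above: a primer annealing to nucleotides 1-19 should be 19 bases long, and one annealing to nucleotides 1278-1257 should be 22 bases long. The following is a minimal sketch of that check; the helper name `span_length` is an assumption made for illustration.

```python
def span_length(start, end):
    """Inclusive length of a coordinate range given in either orientation."""
    return abs(end - start) + 1


# Primer sequences as given for SEQ ID NO 13 and SEQ ID NO 14.
FORWARD = "GACGGATCGGGAGATCTCC"
BACKWARD = "CCGCCTCAGAAGCCATAGAGCC"

# (primer length, length implied by the quoted annealing coordinates)
CHECKS = {
    "forward": (len(FORWARD), span_length(1, 19)),
    "backward": (len(BACKWARD), span_length(1278, 1257)),
}

if __name__ == "__main__":
    for name, (primer_len, coord_len) in CHECKS.items():
        print(name, primer_len, coord_len, primer_len == coord_len)
```

Both primers match the lengths implied by their coordinates, which supports the transcription of the two sequences.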

The resulting plasmid containing E1 and fiber genes, designated pE1/Fiber, provides 
both dexamethasone-inducible E1 function as described for pDEX/E1 and expression of Ad5 
fiber protein as described above. A schematic plasmid map of pE1/Fiber, having 14455 bp, is 
shown in Figure 8. 

The complete nucleotide sequence of pE1/Fiber is listed in SEQ ID NO 15 where 
nucleotide position 1 corresponds approximately to 1459 nucleotides from the 3' end of the 
parent vector pMAM sequence. The 5' and 3' ends of the Ad5 E1 gene are located at 
respective nucleotide positions 1460 and 4998 followed by pMAM backbone and then 
separated from the Ad5 fiber from pCLF by the filled-in blunt ended NdeI site. The 5' and 3' 
ends of the pCLF fiber gene fragment are located at respective nucleotide positions 10922 and 
14223, containing elements as previously described for pCLF. 
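
The coordinates and plasmid sizes quoted in this Example can be cross-checked with simple arithmetic. The sketch below is illustrative only; the constant and function names are assumptions, and the values are those stated in the text (11152 bp for pDEX/E1, 14455 bp for pE1/Fiber).

```python
def inclusive_length(start, end):
    """Number of nucleotides from start to end, counting both ends."""
    return end - start + 1


PDEX_E1_BP = 11152    # size quoted for pDEX/E1
PE1_FIBER_BP = 14455  # size quoted for pE1/Fiber

e1_region_bp = inclusive_length(1460, 4998)         # Ad5 E1 gene coordinates
fiber_fragment_bp = inclusive_length(10922, 14223)  # pCLF fiber fragment coordinates
size_increase_bp = PE1_FIBER_BP - PDEX_E1_BP        # growth of the plasmid

if __name__ == "__main__":
    print("E1 region:", e1_region_bp)
    print("fiber fragment:", fiber_fragment_bp)
    print("plasmid size increase:", size_increase_bp)
```

The size increase between the two plasmids and the span of the fiber fragment agree to within one base pair; the text does not account for the remaining difference.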

The resultant pE1/Fiber plasmid is then used to complement one or more delivery 
plasmids expressing E1 and fiber. 

The pE1/Fiber construct is then used to transfect a select host cell as described in 
Example 1E to generate stable chromosomal insertions performed as previously described 
[...] described in Example 2. 
[...] expression of the fiber gene as described for pCLF and E4 function as described for 
[...] 
Example 2 

Preparation of Adenoviral Gene Delivery Vectors 
Using Adenoviral Packaging Cell Lines 

Adenoviral delivery vectors of this invention are prepared to separately lack the 
combinations of E1/fiber and E4/fiber. Such vectors are more replication-defective than those 
previously in use due to the absence of multiple viral genes. A preferred adenoviral delivery 
vector of this invention that is replication competent but only via a non-fiber means is one that 
only lacks the fiber gene but contains the remaining functional adenoviral regulatory and 
structural genes. Furthermore, the adenovirus delivery vectors of this invention have a higher 
capacity for insertion of foreign DNA. 

A. Preparation of Adenoviral Gene Delivery Vectors Having Specific Gene Deletions and 
Methods of Use 

To construct the E1/fiber-deleted viral vector containing the LacZ reporter gene 
construct, two new plasmids were constructed. The plasmid pΔE1Bβgal was constructed as 
follows. By digestion of pSVβgal (Promega Corp., Madison, WI) with VspI, a DNA 
fragment containing the SV40 regulatory sequences and the E. coli β-galactosidase gene was 
isolated. The resulting fragment having overhanging ends was then filled in with Klenow 
fragment of DNA polymerase I in the presence of dNTPs followed by digestion with BamHI. 
The resulting fragment was cloned into the EcoRV and BamHI sites in the polylinker of 
pΔE1sp1B (Microbix Biosystems, Hamilton, Ontario) to form pΔE1Bβgal that therefore 
contained the left end of the adenovirus genome with the E1a region replaced by the LacZ 
cassette (nucleotides 6690 to 4151) of pSVβgal. Plasmid DNA was prepared by the alkaline 
lysis method as described by Birnboim and Doly, Nuc. Acids Res., 7:1513-1523 (1978) from 
transformed cells used to expand the plasmid. DNA was then purified by CsCl-ethidium 
bromide density gradient centrifugation. 

The second plasmid (pDV44), prepared as described herein, is derived from pBHG10, 
a vector prepared as described by Bett et al., Proc. Natl. Acad. Sci. USA, 91:8802-8806 
(1994) and commercially available from Microbix, which contains an Ad5 genome with the 
packaging signals at the left end deleted and the E3 region (nucleotides 28133:30818) replaced 
[...] 

In a preferred embodiment of this invention, a delivery plasmid is prepared that does 
[...] above described recombination events to prepare a therapeutic viral vector 
[...] necessary for packaging but lacking the fiber gene is prepared from plasmid pFG140 
containing full-length Ad5 that is commercially available from Microbix. The resultant 
delivery plasmid referred to as pFG140-f is then used with pCLF stably integrated cells as 
described above to prepare a therapeutic viral vector lacking fiber. In a preferred aspect of this 
invention, the fiber gene is replaced with a therapeutic gene of interest for preparing a 
therapeutic delivery adenoviral vector. 

Vectors for the delivery of any desired therapeutic gene are prepared by cloning the 
gene of interest into the multiple cloning sites in the polylinker of commercially available 
pΔE1sp1B (Microbix Biosystems), in an analogous manner as performed for preparing pΔE1Bβgal 
as described above. The same cotransfection and recombination procedure is then followed as 
described herein to obtain viral gene delivery vectors. 

The recombinant viruses thus produced are used as gene delivery tools both in cultured 
cells and in vivo. For studies of the effectiveness and relative immunogenicity of 
multiply-deleted vectors, virus particles are produced by growth in the packaging lines described in 
Example 1 and are purified by CsCl gradient centrifugation. Following titering, virus particles 
are administered to mice via systemic or local injection or by aerosol delivery to lung. The 
LacZ reporter gene allows the number and type of cells which are successfully transduced to 
be evaluated. The duration of transgene expression is evaluated in order to determine the 
long-term effectiveness of treatment with multiply-deleted recombinant adenoviruses relative 
to the standard technologies which have been used in clinical trials to date. The immune 
response to the improved vectors described here is determined by assessing parameters such as 
inflammation, production of cytotoxic T lymphocytes directed against the vector, and the 
nature and magnitude of the antibody response directed against viral proteins. 

Versions of the vectors which contain therapeutic genes such as CFTR for treatment of 
cystic fibrosis or tumor suppressor genes for cancer treatment are evaluated in the animal 
system for safety and efficiency of gene transfer and expression. Following this evaluation, 
they are used as experimental therapeutic agents in human clinical trials. 

B. Retargeting of Adenoviral Gene Delivery Vectors by Producing Viral Particles 
Containing Different or Altered Fiber Proteins 

As the specificity of adenovirus binding to target cells is largely determined by the fiber 
protein, viral particles that incorporate modified fiber proteins or fiber proteins from different 
[...] 
of the native Ad5 fiber protein in adenovirus packaging cells as described above [...] also 
[...] production of different fiber proteins. 
[...] positions 1 to 403, are connected to the Ad3 fiber head region [...]; the nucleotide region 
[...] 
fiber shaft-head junction. 

The resultant chimeric fiber PCR products are then digested with BamHI and NotI for 
separate directional ligation into a similarly digested pcDNA vector. The Ad2 leader sequence 
is then subcloned into the BamHI site as described in Example 1A for preparing an expression 
vector for subsequent transfection into 211 cells as described above or into the alternative 
packaging cell systems as previously described. The resultant chimeric fiber 
construct-containing adenoviral packaging cell lines are then used to complement adenoviral delivery 
vectors as previously described. Other fiber chimeric constructs are obtained using a similar 
approach with the various adenovirus serotypes known. 

In an alternative embodiment, the methods of this invention contemplate the use of the 
modified proteins including novel epitopes as described by Michael et al., Gene Therapy, 
2:660-668 (1995) and in International Publication WO 95/26412, the disclosures of which are 
incorporated by reference herein. Both publications describe the construction of a cell-type 
specific therapeutic viral vector having a new binding specificity incorporated into the virus 
concurrent with the destruction of the endogenous viral binding specificity. In particular, the 
authors described the production of an adenoviral vector encoding a gastrin releasing peptide 
(GRP) at the 3' end of the coding sequence of the Ad5 fiber gene. The resulting fiber-GRP 
fusion protein was expressed and shown to assemble functional fiber trimers that were 
correctly transported to the nucleus of HeLa cells following synthesis. 

Based on the teachings in the paper and International Publication, similar constructs are 
contemplated for use in the complementing adenoviral packaging cell systems of this invention 
for generating new adenoviral gene delivery vectors that are replication-deficient and less 
immunogenic. Heterologous ligands contemplated for use herein to redirect fiber specificity 
range from as few as 10 amino acids in size to large globular structures, some of which 
necessitate the addition of a spacer region so as to reduce or preclude steric hindrance of the 
heterologous ligand with the fiber or prevent trimerization of the fiber protein. The ligands are 
inserted at the end or within the linker region. Preferred ligands include those that target 
specific cell receptors or those that are used for coupling to other moieties such as biotin and 
avidin. The type of cell signaling that results from binding by a ligand is dependent upon the 
specificity of that ligand, i.e., receptor internalization or lack thereof. 

A preferred spacer includes a short 12 amino acid peptide linker composed of a series 
of serines and alanine flanked by a proline residue at each end. One of ordinary skill in the art 
is familiar with the [...] of linkers to accomplish sufficient [...] 
of the fiber protein without compromising the cellular events [...] 
terminus and at the carboxy terminus as described below are contemplated for use [...] 

[...] for the ligand of choice, site-directed 
[...] 
[...] prepared in Example 1. The 3' or antisense 
[...] Ad5 fiber construct in pCLF [...] (SEQ ID NO [...] 
encodes a preferred linker sequence of ProSerAlaSerAlaSerAla[...] 
[...] by a unique restriction site and two stop codons, respectively, [...] 
[...] a coding sequence for a selected heterologous ligand and to ensure proper 
[...] oligonucleotide includes sequences that 
[...] 
sequences, a nucleotide sequence encoding a preselected ligand is obtained, linkers 
[...] 
[...] corresponding restriction site. 

In the resultant pCLF vector containing an Ad5 fiber gene sequence with 
nucleotides encoding a linker and a ligand, the Ad2 leader sequence is inserted as previously 
described. The resultant fiber-ligand construct is then used to transfect 211 or the [...] 
packaging system previously described to produce complementing [...] vector packaging 
systems for use with the methods of this invention. 

In a further embodiment, fiber proteins encoded by fiber genes [...] 
[...] serotypes are used intact for transfection into 211 or an alternative cell packaging 
system as previously described. 

A gene encoding the fiber protein of interest is first cloned to create a plasmid 
[...] to pCLF, and stable cell lines producing the fiber protein are generated as described 
above for Ad5 fiber. The adenovirus vector described which lacks the fiber gene is then 
[...] 
only fiber gene present is the one in the packaging cells, the adenoviruses produced contain 
only the fiber protein of interest and therefore have the binding specificity conferred by the 
complementing protein. Such viral particles are used in studies such as those described above 
to determine their properties in experimental animal systems. 

C. Targeted Gene Delivery Using Viral Vector Particles Lacking Fiber Protein 

An alternative mode of entry for adenoviral infection of hematopoietic cells has been 
described by Huang, et al., J. Virol., 69:2257-2263 (1995) which does not involve the fiber 
protein-host cell receptor interaction. As infection of most other cell types does require the 
presence of fiber protein, vector particles which lack fiber may preferentially infect 
hematopoietic cells, such as monocytes or macrophages. 

To produce a fiber-free adenovirus vector particle, a vector lacking the fiber gene as 
described above in Example 2A but containing a gene of interest for delivery is amplified by 
growth in cells which do not produce a fiber protein, such as the 211 cells prepared in Example 
1, thereby producing large numbers of particles lacking fiber protein. The recovered fiber-free 
viral particles are then used to deliver the inserted gene of interest following the methods of 
this invention via targeting mechanisms provided by other regions of the adenoviral vector, i.e., 
via the native penton base. 

Example 3 

Deposit of Materials 

The following cell lines and plasmids have been deposited on September 25, 1996, with 
the American Type Culture Collection, 1301 Parklawn Drive, Rockville, MD, USA (ATCC): 

Material                ATCC Accession No. 

Plasmid pE4/Hygro       97739 
Plasmid pCLF            97737 
211 Cell Line           CRL-12193 
211A Cell Line          CRL-12194 



The foregoing written specification is considered to be sufficient to enable one skilled 
in the art to practice the invention. The present invention is not to be limited in scope by the 
cell lines and plasmids deposited, since the deposited embodiment is intended as a single 
illustration of one aspect of the invention and any cell lines or plasmid vectors that are 
functionally equivalent are within the scope of this invention. 

The foregoing specification, including the specific embodiments and examples, is 
intended to be illustrative of the present invention and is not to be taken as limiting. Numerous 
[...] 
scope of the present invention. 

Example 4 

The native fiber protein is a homotrimer { Henry L.J. et al 1994 Characterization of the knob 
[...] }. 

[...] for two hours [...] 
[...] beads ([...]), extensively washed with [...] buffer [...] 
[...] electrophoresed on an [...] SDS-PAGE gel [...] 
[...] SDS in loading buffer. [...] 

As seen in Fig. [...], lines 211A, 211B, and 211R, but not the control 293 cells, expressed an 
[...] 
fiber { Wickham T et al 1993 Cell 73:309-319} (the 58 kD Ad2 and 62 kD Ad5 fibers have very 
similar mobilities under these conditions). 

To determine whether the fiber-expressing lines could support the growth of a fiber-defective 
adenovirus, we performed one-step growth experiments using the temperature-sensitive fiber mutant Ad 
H5ts142 (the gift of Harold Ginsberg). At the restrictive temperature (39.5 °C), this mutant produces 
an underglycosylated fiber protein which is not incorporated into mature virions { Chee-Sheung C. C et 
al 1982 J. Virol 42: 932-950 }. This results in the accumulation of non-infectious viral particles. We 
asked whether the recombinant fiber protein expressed by our cell lines could complement the H5ts142 
defect and rescue viral growth. 

Cell lines 293, 211A, 211B and 211R (2 x 10^6 cells/sample) were infected with H5ts142 at 10 
pfu/cell. 48 hours later, cells were detached with 25 mM EDTA and virus was harvested by four rapid 
freeze-thaw cycles. Debris was removed by a 10 minute spin at 1500 x g, and viral titers determined by 
fluorescent focus assay { Thiel J.F et al 1967 Proc. Soc. Exp. Biol. Med. 125:892-895 } on SW480 
cells with a polyclonal anti-penton base Ab { Wickham T et al 1993 Cell 73:309-319}. As shown in 
Fig. 14, the fiber mutant virus replicated to high titers in 293 cells at 32.5 °C (the permissive 
temperature), but to a much lower extent at the restrictive temperature of 39.5 °C. The fiber-producing 
packaging lines 211A, 211B, or 211R supported virus production at 39.5 °C to levels within two- to 
three-fold of those seen at the permissive temperature in 293 cells, indicating that these cells provided 
partial complementation of the fiber defect. 
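
The infection conditions above fix the total inoculum per sample, and the "within two- to three-fold" comparison is a simple ratio test. The following minimal sketch illustrates both calculations; the function names are assumptions, and no titer values from the experiment are used because none are stated in the text.

```python
CELLS_PER_SAMPLE = 2 * 10**6  # cells per sample, as stated in the text
MOI_PFU_PER_CELL = 10         # multiplicity of infection, as stated in the text


def inoculum_pfu(cells, moi):
    """Total plaque-forming units added to one sample."""
    return cells * moi


def within_fold(titer_a, titer_b, fold):
    """True if two positive titers differ by no more than the given fold."""
    high, low = max(titer_a, titer_b), min(titer_a, titer_b)
    return high <= fold * low


if __name__ == "__main__":
    print(inoculum_pfu(CELLS_PER_SAMPLE, MOI_PFU_PER_CELL))
```

Under the stated conditions each sample receives 2 x 10^7 pfu of input virus.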

Interestingly, virus yields from the fiber-producing cell lines were also somewhat higher than 
those from 293 cells at 32.5 °C (the 'permissive' temperature). This suggests that fiber produced by the 
ts142 virus may be partially defective even at the permissive temperature. Alternatively, a 
non-specific increase in adenoviral titer could result when viruses are grown in our packaging cells, by a 
mechanism not involving fiber complementation. However, we have found that viruses with wild type 
fiber genes (such as Ad.RSVβgal) replicate to identical levels either in our packaging lines or in 293 
[...] 
demonstrates that the type 5 fiber protein produced by the cells is capable of assembling into complete 
Ad3 particles. 

A vector based on Ad5 but containing the gene for the Ad7 fiber protein has been described 
{ Gall J. et al 1996 J. Virol. 70:2116-2123}, as well as Ads containing chimeric fiber genes {Krasnykh 
V. N et al J. Virol. 70:6839-6846}. Addition of a short peptide linker to the fiber in order to confer 
binding to a different cellular protein has also been reported { 8188 }. By using packaging technology 
such as that presented here, Ad vectors equipped with different fiber proteins may be produced simply 
by growth in cells expressing the fiber of interest, without the time-consuming step of generating a new 
vector genome for each application. 

Replacing or modifying the fiber gene in the vector chromosome would also require that the new 
fiber protein bind a receptor on the surface of the cells in which it is to be grown. The packaging cell 
approach will allow the generation of Ad particles containing a fiber which can no longer bind to its 
host cells, by a single round of growth in cells expressing the desired fiber gene. This will greatly 
expand the repertoire of fiber proteins which can be incorporated into particles, as well as simplifying 
the process of retargeting gene delivery vectors. 

Finally, a novel fiber-independent pathway of infection has recently been described in 
hematopoietic cells, in which penton base provides the initial virus-cell interaction by binding to integrin 
αMβ2 { Huang S. et al 1996 J. Virol 70: 4502-4508}. This suggests that viral particles lacking fiber 
protein may be useful in targeting gene delivery to specific cell types via this pathway. 
SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: Nemerow, Glen R. 
Von Seggern, Daniel J. 

(ii) TITLE OF INVENTION: PACKAGING CELL LINES, ADENOVIRUS 
VECTORS, AND METHODS OF USING SAME 

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: THE SCRIPPS RESEARCH INSTITUTE 
(B) STREET: 10550 North Torrey Pines Road 
(C) CITY: La Jolla 
(D) STATE: California 
(E) COUNTRY: United States 
(F) ZIP: 92037 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 
(A) APPLICATION NUMBER: US 
(B) FILING DATE: 25-SEP-1996 
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Logan, April C. 
(B) REGISTRATION NUMBER: 33,950 
(C) REFERENCE/DOCKET NUMBER: TSRI 554.0 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (619) 554-2937 
(B) TELEFAX: (619) 554-6312 



(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

CGGTACACAG AATTCAGGAG ACACAACTCC 
30 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

GCCTGGATCC GGGAAGTTAC GTAACGTGGG AAAAC 
35 



(2) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 

CGCGGATCCG CG 
12 
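
The declared LENGTH of each short entry can be verified against its sequence once the grouping spaces are removed. The sketch below does this for SEQ ID NOs 1 to 3 and also locates the GAATTC and GGATCC motifs they contain. Labelling these motifs as EcoRI and BamHI sites follows the standard recognition sequences and is not taken from this section of the listing; the helper names are likewise illustrative.

```python
# Sequence text and declared length for each entry, as printed in the listing.
LISTING = {
    1: ("CGGTACACAG AATTCAGGAG ACACAACTCC", 30),
    2: ("GCCTGGATCC GGGAAGTTAC GTAACGTGGG AAAAC", 35),
    3: ("CGCGGATCCG CG", 12),
}

# Standard recognition sequences (assumed labels, see lead-in).
ECORI = "GAATTC"
BAMHI = "GGATCC"


def normalize(seq_text):
    """Strip the ten-base grouping spaces and upper-case the sequence."""
    return "".join(seq_text.split()).upper()


def length_matches(listing):
    """Return {seq_id: True/False} for declared versus actual length."""
    return {
        seq_id: len(normalize(text)) == declared
        for seq_id, (text, declared) in listing.items()
    }


if __name__ == "__main__":
    print(length_matches(LISTING))
    print(normalize(LISTING[1][0]).find(ECORI))
    print(normalize(LISTING[2][0]).find(BAMHI))
    print(normalize(LISTING[3][0]).find(BAMHI))
```

The same length check can be applied to longer entries as a way of spotting rows damaged during text extraction.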



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8710 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 


CACCTAAATT GTAAGCGTTA ATATTTTGTT AAAATTCGCG TTAAATTTTT GTTAAATCAG 
60 

CTCATTTTTT AACCAATAGG CCGAAATCGG CAAAATCCCT TATAAATCAA AAGAATAGAC 
120 

CGAGATAGGG TTGAGTGTTG TTCCAGTTTG GAACAAGAGT CCACTATTAA AGAACGTGGA 
180 

CTCCAACGTC AAAGGGCGAA AAACCGTCTA TCAGGGCGAT GGCCCACTAC GTGAACCATC 

240 

ACCCTAATCA AGTTTTTTGG GGTCGAGGTG CCGTAAAGCA CTAAATCGGA ACCCTAAAGG 
300 

GAGCCCCCGA TTTAGAGCTT GACGGGGAAA GCCGGCGAAC GTGGCGAGAA AGGAAGGGAA 
360 

GAAAGCGAAA GGAGCGGGCG CTAGGGCGCT GGCAAGTGTA GCGGTCACGC TGCGCGTAAC 
420 

CACCACACCC GCCGCGCTTA ATGCGCCGCT ACAGGGCGCG TCCCATTCGC CATTCAGGCT 
480 

GCGCAACTGT TGGGAAGGGC GATCGGTGCG GGCCTCTTCG CTATTACGCC AGCTGGCGAA 
540 

AGGGGGATGT GCTGCAAGGC GATTAAGTTG GGTAACGCCA GGGTTTTCCC AGTCACGACG 

600 

TTGTAAAACG ACGGCCAGTG AATTGTAATA CGACTCACTA TAGGGCGAAT TGGGTACCGG 
660 

GCCCCCCCTC GAGGTCGACG GTATCGATAA GCTTGATATC GAATTCAGGA GACACAACTC 
720 



WO 98/13499 



-64 



PCT/EP97/05251 



„A CTCTATGTCA — »~~ " " 

780 

„C« ATCCTCTTAC ACTTTTTCAT ACA.GGCCA AGAATAAAGA ATCGTTTGTG 
840 

TTATGTTTCA ACGTGTTTAT TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT 
900 

AGTATAGCCC CACCACCACA »«* AGATCACCGT ACCTTAATCA AACTCACAGA 

ACCCTAGTAT TCAACC.GCC ACCTCCCTCC CAACACACAG AGTACACAGT CC^CCC 
1020 

CGGCTGGCCT TAAAAAGCAT CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC 
1080 

CACACGGTTT CCTGTCGAGC CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC 
1140 

TCACTTAAGT — """"" 

1200 

TGCTTAACGG GCGGCGAAGG AGAAGTCCAC GCC.ACA.GG GGGTAGAGTC ATAATCGTGC 
1260 

„G GGCGGTGGTG CXGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGGGGC.CC 
1320 

GTCCTGCAGG AATACAACAT GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC 
1380 

ATAAGGCGCC TTGTCCTCCG GGCACAGCAG CGCACCCTGA TCTCACTTAA ATCAGCACAG 
1440 

TAACTGCAGC ACAGCACCAC AATATTGTTC AAAA.CCCAC AGTGCAAGGC GCTGTATCCA 
1500 
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AAGCTCATGG CGGGGACCAC AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT 
1560 

AAGTGGCGAC CCCTCATAAA CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA 

1620 

TTCACCACCT CCCGGTACCA TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC 
1680 

CTAAACCAGC TGGCCAAAAC CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA 
1740 

CAATGACAGT GGAGAGCCCA GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA 
1800 

ATGTTGGCAC AACACAGGCA CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC 

1860 

GTTAGAACCA TATCCCAGGG AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG 
1920 

GGAAGACCTC GCACGTAACT CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC 

1980 

GGATGATCCT CCAGTATGGT AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA 
2040 

CTGTACGGAG TGCGCCGAGA CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA 

2100 

ACGCCGGACG TAGTCATATT TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT 
2160 

GCGTCTCCGG TCTCGCCGCT TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT 
2220 

CAAAGCATCC AGGCGCCCCC TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC 
2280 
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„a TC ga=gacgg c caagcacac «-«. 



2340 

C „G AGGGGAGGAG GGGGAAGAGG T GGAAGAAG= ATGTTTTTTT ^GCA 
2400 

AAAGATTATC CAAAACCTCA AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG 
2460 

«AAA -C«C AAAGAACAGA TAATGGCATT — " 
2520 

CTTCCAAAAG GCAAACGGCC CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT 
2580 

G „C TATAAACATT CCAGCACCTT GAAGCA.GCG CAAATAATTC TCATCTCGCG 

2640 

ACCTTCTCAA TATATCTCTA AGCAAATGGC GAATATTAAG TGGGGCGA,, GTAAAAATCT 

HcACAOC GCGCTCCAGG TTCAOCCCA AGCAGCGAAT CA^GCA AAAATTCAGG 
2760 

TTCCTCACAG ACCTGTATAA GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG 
2820 

TAGGTCCCTT CGCAGGGCCA GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC 
2880 

CACTTCCCCO GCAGGAAGCT TGACAAAAGA ACCCACAC.G ATTATGACAC GCATACTCGG 
2940 

AGCTATGCTA ACCAGCGTAG CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT 
3000 

GCAAGGTGCT GCTCAAAAAA TCAGGCAAAG CC.CGCGCAA AAAAGAAAGG ACATCGTAGT 
3060 
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CATGCTCATG CAGATAAAGG CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT 
3120 

TTCTCTCAAA CATGTCTGCG GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT 
3180 

TTAAACATTA GAAGCCTGTC TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC 
3240 

TACGGCCATG CCGGCGTGAC CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA 
3300 

CAGCTCCTCG GTCATGTCCG GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT 
3360 

CATCGGTCAG TGCTAAAAAG CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA 
3420 

GAGACAACAT TACAGCCCCC ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT 
3480 

AAACACCTGA AAAACCCTCC TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT 
3540 

ACAGCGCTTC ACAGCGGCAG CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA 
3600 

AAAAAACACC ACTCGACACG GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT 
3660 

GCAGAGCGAG TATATATAGG ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC 
3720 

CCAGAAAACC GCACGCGAAC CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT 
3780 

CAAATCGTCA CTTCCGTTTT CCCACGTTAC GTAACTTCCC GGATCCGCGG CATTCACAGT 
3840 
TCTCCGCAAG AATTGATTGG CTCCAATTCT TGGAGTGGTG AATCCGTTAG CGAGGTGCCG 
3900 

CCGGCTTCCA TTCAGGTCGA GGTGGCCCGG CTCCATGCAC CGCGACGCAA CGCGGGGAGG 
3960 

CM ACAAGG T -GGGCGGC GCCTACAATC CA^CAACC =~ 
4020 

„AAA -CCG.GAC GATCAGCGGT CCAG^CG AAGCTAGGCT GGTAAGAGCC 
4080 

GCGAGCGATC CTTGAAGCTG TCCCTGATGG TCGTCATCTA CCTGCCTGGA CAGCATGGCC 
4140 

TGCAACGCGG GCATCCCGAT GCCGCCGGAA GCGAGAAGAA TCATAATGGG GAAGGCCATC 
4200 

^gcg tcgcgaacgc cagcaagacg tagcccagcg cgtcggccgc catgccctgc 

4260 

recces .ggcccg^ ctcgcg^ ctggcggtgt cecals* aatata^ 

4320 

CATGTCTTTA GTTCTATGAT GACACAAACC CCGCCCAGCG TCTTGTCATT GGCGAATTCG 
4380 

AACACGCAGA TGCAGTCGGG GCGGCGCGGT CCCAGGTCCA CTTCGCATAT TAAGGTGACG 
4440 

CGTGTGGCCT CGAACACCGA GCGACCCTGC AGCGACCCGC TTAACAGCGT CAACAGCGTG 
4500 

CCGCAGATCC CGGGCAATGA GATATGAAAA AGCCTGAACT CACCGCGACG TCTGTCGAGA 
4560 

AGTTTCTGAT CGAAAAGTTC GACAGCGTCT CCGACCTGAT GCAGCTCTCG GAGGGCGAAG 
4620 
AATCTCGTGC TTTCAGCTTC GATGTAGGAG GGCGTGGATA TGTCCTGCGG GTAAATAGCT 
4680 

GCGCCGATGG TTTCTACAAA GATCGTTATG TTTATCGGCA CTTTGCATCG GCCGCGCTCC 
4740 

CGATTCCGGA AGTGCTTGAC ATTGGGGAAT TCAGCGAGAG CCTGACCTAT TGCATCTCCC 
4800 

GCCGTGCACA GGGTGTCACG TTGCAAGACC TGCCTGAAAC CGAACTGCCC GCTGTTCTGC 
4860 

AGCCGGTCGC GGAGGCCATG GATGCGATCG CTGCGGCCGA TCTTAGCCAG ACGAGCGGGT 
4920 

TCGGCCCATT CGGACCGCAA GGAATCGGTC AATACACTAC ATGGCGTGAT TTCATATGCG 
4980 

CGATTGCTGA TCCCCATGTG TATCACTGGC AAACTGTGAT GGACGACACC GTCAGTGCGT 
5040 

CCGTCGCGCA GGCTCTCGAT GAGCTGATGC TTTGGGCCGA GGACTGCCCC GAAGTCCGGC 
5100 

ACCTCGTGCA CGGGGATTTC GGCTCCAACA ATGTCCTGAC GGACAATGGC CGCATAACAG 
5160 

CGGTCATTGA CTGGAGCGAG GCGATGTTCG GGGATTCCCA ATACGAGGTC GCCAACATCT 
5220 

TCTTCTGGAG GCCGTGGTTG GCTTGTATGG AGCAGCAGAC GCGCTACTTC GAGCGGAGGC 
5280 

ATCCGGAGCT TGCAGGATCG CCGCGGCTCC GGGCGTATAT GCTCCGCATT GGTCTTGACC 
5340 

AACTCTATCA GAGCTTGGTT GACGGCAATT TCGATGATGC AGCTTGGGCG CAGGGTCGAT 
5400 

GCGACGCAAT CGTCCGATCC GGAGCCGGGA CTGTCGGGCG TACACAAATC GCCCGCAGAA 
5460 

GCGCGGCCGT CTGGACCGAT GGCTGTGTAG AAGTACTCGC CGATAGTGGA AACCGACGCC 
5520 

CCAGCACTCG TCCGAGGGCA AAGGAATAGG GGAGATGGGG GAGGCTAACT GAAACACGGA 
5580 

AGGAGACAAT ACCGGAAGGA ACCCGCGCTA TGACGGCAAT AAAAAGACAG AATAAAACGC 
5640 

ACGGGTGTTG GGTCGTTTGT TCATAAACGC GGGGTTCGGT CCCAGGGCTG GCACTCTGTC 

Taccccac cgagacccca T tggggccaa T acgcccgcg tttcttcctt ttccccaccc 

CACCCCCCAA GTTCGGGTGA AGGCCCAGGG CTCGCAGCCA ACGTCGGGGC GGCAGGCCCT 
5820 

GCCATAGCCA CTGGCCCCGT GGGTTAGGGA CGGGGTCCCC CATGGGGAAT GGTTTATGGT 
5880 

TCGTGGGGGT TATTATTTTG GGCGTTGCGT GGGGTCTGGT CCACGACTGG ACTGAGCAGA 
5940 

CAGACCCA.G GTTTTTGGAT «~ " ~C=C GACACGAACA 
6000 

C „ T — agcgccaaaa acgaccggog 

6060 

COCC^C CGTCGACCGG TCATGGCTGC GGCCCGAGAC CCGCCAACAC CCGCGACOG 

HOGG GG^GC TCCCGGCATC CGCTTACAGA CAAGCTGTGA CCGTCTCCGG 
6180 
GAGCTGCATG TGTCAGAGGT TTTCACCGTC ATCACCGAAA CGCGCGAGGC AGCCGGATCA 
6240 

TAATCAGCCA TACCACATTT GTAGAGGTTT TACTTGCTTT AAAAAACCTC CCCACCTCCC 
6300 

CCTGAACCTG AAACATAAAA TGAATGCAAT TGTTGTTGTT AACTTGTTTA TTGCAGCTTA 
6360 

TAATGGTTAC AAATAAAGCA ATAGCATCAC AAATTTCACA AATAAAGCAT TTTTTTCACT 
6420 

GCATTCTAGT TGTGGTTTGT CCAAACTCAT CAATGTATCT TATCATGTCT GGATCCACTA 
6480 

GTTCTAGAGC GGCCGCCACC GCGGTGGAGC TCCAGCTTTT GTTCCCTTTA GTGAGGGTTA 
6540 

ATTTCGAGCT TGGCGTAATC ATGGTCATAG CTGTTTCCTG TGTGAAATTG TTATCCGCTC 
6600 

ACAATTCCAC ACAACATACG AGCCGGAAGC ATAAAGTGTA AAGCCTGGGG TGCCTAATGA 
6660 

GTGAGCTAAC TCACATTAAT TGCGTTGCGC TCACTGCCCG CTTTCCAGTC GGGAAACCTG 
6720 

TCGTGCCAGC TGCATTAATG AATCGGCCAA CGCGCGGGGA GAGGCGGTTT GCGTATTGGG 
6780 

CGCTCTTCCG CTTCCTCGCT CACTGACTCG CTGCGCTCGG TCGTTCGGCT GCGGCGAGCG 
6840 

GTATCAGCTC ACTCAAAGGC GGTAATACGG TTATCCACAG AATCAGGGGA TAACGCAGGA 
6900 

AAGAACATGT GAGCAAAAGG CCAGCAAAAG GCCAGGAACC GTAAAAAGGC CGCGTTGCTG 
6960 

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n.rrc CCCCCCTGAC GAGCATCACA AAAATCGACG CTCAAGTCAG 
GCGTTTTTCC ATAGGCTCCG CCCCCCTWU. * 

7020 

AGGTGGCGAA AGGGGAGAGG ****** — * 
7080 

GTGCGCTCTC G^GGGAG CCTGCCGCTT AGGGCATAGG TGTCCGCCTT TCTCCCTTCG 

CGCTTTCTC, TAGCTCACGC TGTAGGTATC TCAGTTCGGT GTAGGTCGTT 

7200 

CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG CGCCTTATCC 

GGTAACTATC GTCTTGAGTC CAACCCGGTA AGAGACGACT TATCGCCACT GGCAGGAGCG
7320 

ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT CTTGAAGTGG 
7380 

TGGCCTAACT ACGGCTACAC TAGAAGGACA GTATTTGGTA TCTGCGCTCT GCTGAAGCCA 
7440 

GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC CGCTGGTAGC
7500

GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC TCAAGAAGAT
7560

CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG TTAAGGGATT 
7620 

TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA AAAATGAAGT 
7680 

TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA ATGCTTAATC
7740 






AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT GCATAGTTGC CTGACTCCCC
7800

GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC TGCAATGATA
7860

CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA TAAACCAGCC AGCCGGAAGG
7920

GCCGAGCGCA GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT TAATTGTTGC
7980

CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT TGCCATTGCT
8040

ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC CGGTTCCCAA
8100

CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG CTCCTTCGGT
8160

CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT TATGGCAGCA
8220

CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC TGGTGAGTAC
8280

TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG CCCGGCGTCA
8340

ATACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT TGGAAAACGT
8400

TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC GATGTAACCC
8460

ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC TGGGTGAGCA
8520

AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA ATGTTGAATA
8580

CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC
8640

GGATACATRT TTGAATGTAT — ««~~ 

8700 

CGAAAAGTGC 
8710 

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGGGATCCA AGATGAAGCG CGCAAGACCG 
30 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CATAACGCGG CCGCTTCTTT ATTCTTGGGC 
30 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7148 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 
60 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 
120 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 
180 
TTAGGGTTAG GCGTTTTGCG CTGCTTCGCG ATGTACGGGC CAGATATACG CGTTGACATT
240

GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA
300

TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTCACCG CCCAACGACC
360

CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC
420

ATTGACGTCA ATGGGTGGAC TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCAAGTGT
480

ATCATATGCC AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATT
540

ATGCCCAGTA CATGACCTTA TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA
600

TCGCTATTAC CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG
660

ACTCACGGGG ATTTCCAAGT CTCCACGCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC
720

AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG
780

GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCT CTGGCTAACT AGAGAACCCA
840

CTGCTTACTG GCTTATCGAA ATTAATAGGA GTCACTATAG GGAGACCGAA GCTTGGTACC
900

GAGCTCGGAT CCAAGATGAA GCGCGCAAGA CCGTCTGAAG ATACCTTCAA CCCCGTGTAT
960

CCATATGACA CGGAAACCGG TCCTCCAACT GTGCCTTTTC TTACTCCTCC CTTTGTATCC
1020

CCCAATGGGT TTCAAGAGAG TCCCCCTGGG GTACTCTCTT TGCGCCTATC CGAACCTCTA
1080

GTTACCTCCA ATGGCATGCT TGCGCTCAAA ATGGGCAACG GCCTCTCTCT GGACGAGGCC
1140

GGCAACCTTA CCTCCCAAAA TGTAACCACT GTGAGCCCAC CTCTCAAAAA AACCAAGTCA
1200

AACATAAACC TGGAAATATC TGCACCCCTC ACAGTTACCT CAGAAGCCCT AACTGTGGCT
1260

GCCGCCGCAC CTCTAATGGT CGCGGGCAAC ACACTCACCA TGCAATCACA GGCCCCGCTA
1320

ACCGTGCACG ACTCCAAACT TAGCATTGCC ACCCAAGGAC CCCTCACAGT GTCAGAAGGA
1380

AAGCTAGCCC TGCAAACATC AGGCCCCCTC ACCACCACCG ATAGCAGTAC CCTTACTATC
1440

ACTGCCTCAC CCCCTCTAAC TACTGCCACT GGTAGCTTGG GCATTGACTT GAAAGAGCCC
1500

ATTTATACAC AAAATGGAAA ACTAGGACTA AAGTACGGGG CTCCTTTGCA TGTAACAGAC
1560

GACCTAAACA CTTTGACCGT AGCAACTGGT CCAGGTGTGA CTATTAATAA TACTTCCTTG
1620

CAAACTAAAG TTACTGGAGC CTTGGGTTTT GATTCACAAG GCAATATGCA ACTTAATGTA
1680



GCAGGAGGAC TAAGGATTGA TTCTCAAAAC AGACGCCTTA TACTTGATGT TAGTTATCCG 
1740 
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AAAAGCAAG, »»™» GCCCTCTTTT ***** 

1800 

GCCCACAACT TGGATATTAA CTACAACAAA GGCCTTTACT TGTTTACAGC TTCAAACAAT
1860

TC CAAAAAGC TTGAG^AA CCTAAGCACT GGCAAGGGGT TGATGTTTGA CGC.ACAGGC 
1920 

„r» ATGCAGGAGA TGGGCTTGAA TTTGGTTCAC CTAATGCACC AAACACAAAT 
1980 

CCCCTCAAAA CAAAAATTGG CCATGGCCTA GAATTTGATT CAAACAAGGC TATGGTTCCT 
2040 

AAACTAGGAA CTGGCCTTAG TTTTGACAGC ACAGGTGGCA TTACAGTAGG AAACAAAAAT 
2100 

AATGATAAGC TAACTTTGTG GACGACACGA GCTCCATCTC CTAACTGTAG ACTAAATGCA 
2160 

GAGAAAGATG CTAAACTCAC TTTGGTCTTA ACAAAATGTG GCAGTCAAAT ACTTGCTACA 
2220 

GTTTCAGTTT TGGCTGTTAA AGGCAGTTTG GCTCCAATAT CTGGAACAGT TCAAAGTGCT 
2280 

CATCTTATTA TAAGATTTGA CGAAAATGGA GTGCTACTAA ACAATTCCTT CCTGGACCCA 
2340 

GAATATTGGA ACTTTAGAAA TGGAGATCTT ACTGAAGGCA CAGCCTATAC AAACGCTGTT
2400

GGATTTATGC CTAACCTATC AGCTTATCCA AAATCTCACG GTAAAACTGC CAAAAGTAAC
2460

ATTGTCAGTC AAGTTTACTT AAACGGAGAC AAAACTAAAC CTGTAACACT AACCATTACA
2520

CTAAACGGTA CACAGGAAAC AGGAGACACA ACTCCAAGTG CATACTCTAT GTCATTTTCA
2580

TGGGACTGGT CTGGCCACAA CTACATTAAT GAAATATTTG CCACATCCTC TTACACTTTT
2640

TCATACATTG CCCAAGAATA AAGAAGCGGC CGCTCGAGCA TGCATCTAGA GGGCCCTATT
2700

CTATAGTGTC ACCTAAATGC TAGAGCTCGC TGATCAGCCT CGACTGTGCC TTCTAGTTGC
2760

CAGCCATCTG TTGTTTGCCC CTCCCCCGTG CCTTCCTTGA CCCTGGAAGG TGCCACTGCC
2820

ACTGTCCTTT CCTAATAAAA TGAGGAAATT GCATCGCATT GTCTGAGTAG GTGTCATTCT
2880

ATTCTGGGGG GTGGGGTGGG GCAGGACAGC AAGGGGGAGG ATTGGGAAGA CAATAGCAGG
2940

CATGCTGGGG ATGCGGTGGG CTCTATGGCT TCTGAGGCGG AAAGAACCAG CTGGGGCTCT
3000

AGGGGGTATC CCCACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG
3060

CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC TTTCTTCCCT
3120

TCCTTTCTCG CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG CATCCCTTTA
3180

GGGTTCCGAT TTAGTGCTTT ACGGCACCTC GACCCCAAAA AACTTGATTA GGGTGATGGT
3240

TCACGTAGTG GGCCATCGCC CTGATAGACG GTTTTTCGCC CTTTGACGTT GGAGTCCACG
3300
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g«agaaga= tcaaccca, 

— »~» » TCGGCCTATT G^AAAAAA — 

3420 

TAACAAAAAT TTAACGCGAA TTAATTCTGT GGAATGTGTG TCAGTTAGGG TGTGGAAAGT
3480

CCCCAGGCTC CCCAGGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA GTCAGCAACC
3540

T~ «»» — » GCATCTCAAT 

3600 

„caa cca T ag,ggg gcgcg T aac T ggcgggtaac tccggcgag, 

TCCGCCCATT CTGGGCCGGA TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC 
3720 

GCCTCTGCCT CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG CCTAGGCTTT
3780

TGCAAAAAGC TCCCGGGAGC TTGTATATCC ATTTTCGGAT CTGATCAAGA GACAGGATGA
3840

GGATCGTTTC GCATGATTGA ACAAGATGGA TTGCACGCAG GTTCTCCGGC CGCTTGGGTG
3900

GAGAGGCTAT TCGGCTATGA CTGGGCACAA CAGACAATCG GCTGCTCTGA TGCCGCCGTG
3960

TTCCGGCTGT CAGCGCAGGG GCGCCCGGTT CTTTTTGTCA AGACCGACCT GTCCGGTGCC
4020

CTGAATGAAC TGCAGGACGA GGCAGCGCGG CTATCGTGGC TGGCCACGAC GGGCGTTCCT
4080

TGCGCAGCTG TGCTCGACGT TGTCACTGAA GCGGGAAGGG ACTGGCTGCT ATTGGGCGAA
4140

GTGCCGGGGC AGGATCTCCT GTCATCTCAC CTTGCTCCTG CCGAGAAAGT ATCCATCATG
4200

GCTGATGCAA TGCGGCGGCT GCATACGCTT GATCCGGCTA CCTGCCCATT CGACCACCAA
4260

GCGAAACATC GCATCGAGCG AGCACGTACT CGGATGGAAG CCGGTCTTGT CGATCAGGAT
4320

GATCTGGACG AAGAGCATCA GGGGCTCGCG CCAGCCGAAC TGTTCGCCAG GCTCAAGGCG
4380

CGCATGCCCG ACGGCGAGGA TCTCGTCGTG ACCCATGGCG ATGCCTGCTT GCCGAATATC
4440

ATGGTGGAAA ATGGCCGCTT TTCTGGATTC ATCGACTGTG GCCGGCTGGG TGTGGCGGAC
4500

CGCTATCAGG ACATAGCGTT GGCTACCCGT GATATTGCTG AAGAGCTTGG CGGCGAATGG
4560

GCTGACCGCT TCCTCGTGCT TTACGGTATC GCCGCTCCCG ATTCGCAGCG CATCGCCTTC
4620

TATCGCCTTC TTGACGAGTT CTTCTGAGCG GGACTCTGGG GTTCGAAATG ACCGACCAAG
4680

CGACGCCCAA CCTGCCATCA CGAGATTTCG ATTCCACCGC CGCCTTCTAT GAAAGGTTGG
4740

GCTTCGGAAT CGTTTTCCGG GACGCCGGCT GGATGATCCT CCAGCGCGGG GATCTCATGC
4800



TGGAGTTCTT CGCCCACCCC AACTTGTTTA TTGCAGCTTA TAATGGTTAC AAATAAAGCA 
4860 
ATAGCATCAC AAATTTCACA AATAAAGCAT TTTTTTCACT GCATTCTAGT TGTGGTTTGT
4920

C— TATCATGTCT ***** 

4980 

CGTAATCATG TTTCCTGTGT GAAATTG^A TCCGCCACA ATTCCACACA 

5040 

^ACGAGC GGGAAGGA.A AAGTGTAAAG CCTGCGGTCC CTAATGAGTG AGCTAACTCA 
5100 

CATTAATTGC GTTGCGCTCA CTGCCCGCTT TCCAGTCGGG AAACCTGTCG TGCCAGCTGC
5160

ATTAATGAAT CGGCCAACGC GGGGGGAGAG GCGGTTTGCG TATTGGGCGC TCTTCCGCTT
5220

CCTCGCTCAC TGACTCGCTG CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT
5280

CAAAGGCGGT AATACGGTTA TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG
5340

CAAAAGGCCA GCAAAAGGCC AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA
5400

GGCTCCGCCC CCCTGACGAG CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC
5460

CGACAGGACT ATAAAGATAC CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG
5520

TTCCGACCCT GCCGCTTACC GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC
5580

TTTCTCAATG CTCACGCTGT AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG 
5640 
GCTGTGTGCA CGAACCCCCC GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC
5700 

TTGAGTCCAA CCCGGTAAGA CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA 
5760 

TTAGCAGAGC GAGGTATGTA GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG
5820

GCTACACTAG AAGGACAGTA TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA
5880

AAAGAGTTGG TAGCTCTTGA TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG
5940

TTTGCAAGCA GCAGATTACG CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT 
6000 

CTACGGGGTC TGACGCTCAG TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT
6060

TATCAAAAAG GATCTTCACC TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT
6120

AAAGTATATA TGAGTAAACT TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA 
6180 

TCTCAGCGAT CTGTCTATTT CGTTCATCCA TAGTTGCCTG ACTCCCCGTC GTGTAGATAA 
6240 

CTACGATACG GGAGGGCTTA CCATCTGGCC CCAGTGCTGC AATGATACCG CGAGACCCAC 
6300 

GCTCACCGGC TCCAGATTTA TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA 

6360 

GTGGTCCTGC AACTTTATCC GCCTCCATCC AGTCTATTAA TTGTTGCCGG GAAGCTAGAG 
6420 
TAAGTAGTTC GCCAGTTAAT AGTTTGCGCA ACGTTGTTGC CATTGCTACA GGCATCGTGG
6480

TGTCACGCTC GTCG^GCT TCAGCTCCGG ^CCCAACGA TCAAGGCGAG 

6540 

TTACATGATC CCCCATGTTG TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG
6600

TCAGAAGTAA GTTGGCCGCA GTGTTATCAC TCATGGTTAT GGCAGCACTG CATAATTCTC
6660

TTACTGTCAT GCCATCCGTA AGATGCTTTT CTGTGACTGG TGAGTACTCA ACCAAGTCAT
6720

TCTGAGAATA GTGTATGCGG CGACCGAGTT GCTCTTGCCC GGCGTCAATA CGGGATAATA
6780

CCGCGCCACA TAGCAGAACT TTAAAAGTGC TCATCATTGG AAAACGTTCT TCGGGGCGAA
6840

AACTCTCAAG GATCTTACCG CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGGACCCA
6900

ACTGATCTTC AGCATCTTTT ACTTTCACCA GCGTTTCTGG GTGAGCAAAA ACAGGAAGGC
6960

AAAATGCCGC AAAAAAGGGA ATAAGGGCGA CACGGAAATG TTGAATACTC ATACTCTTCC
7020

TTTTTCAATA TTATTGAAGC ATTTATCAGG GTTATTGTCT CATGAGCGGA TACATATTTG
7080

AATGTATTTA GAAAAATAAA CAAATAGGGG TTCCGCGCAC ATTTCCCCGA AAAGTGCCAC
7140



CTGACGTC 
7148 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7469 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

60

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 
120 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 
180 

TTAGGGTTAG GCGTTTTGCG CTGCTTCGCG ATGTACGGGC CAGATATACG CGTTGACATT
240 

GATTATTGAC TAGTTATTAA TAGTAATCAA TTACGGGGTC ATTAGTTCAT AGCCCATATA 
300 

TGGAGTTCCG CGTTACATAA CTTACGGTAA ATGGCCCGCC TGGCTGACCG CCCAACGACC 
360 

CCCGCCCATT GACGTCAATA ATGACGTATG TTCCCATAGT AACGCCAATA GGGACTTTCC 
420 
ATTGACGTCA ATGCGTGGAC TATTTACGGT AAACTGCCCA CTTGGCAGTA CATCAAGTCT
480

ATCATATGCC AAGTACGCCC CCTATTGACG TCAATGACGG TAAATGGCCC GCCTGGCATT
540

ATGCCCAGTA CATGACCTTA TGGGACTTTC CTACTTGGCA GTACATCTAC GTATTAGTCA
600

TCGCTATTAC CATGGTGATG CGGTTTTGGC AGTACATCAA TGGGCGTGGA TAGCGGTTTG
660

ACTCACGGGG ATTTCCAAGT CTCCACCCCA TTGACGTCAA TGGGAGTTTG TTTTGGCACC
720

AAAATCAACG GGACTTTCCA AAATGTCGTA ACAACTCCGC CCCATTGACG CAAATGGGCG
780



GTAGGCGTGT ACGGTGGGAG GTCTATATAA GCAGAGCTCT CTGGCTAACT AGAGAACCCA 



840 



CTGCTTACTG GCTTATCGAA ATTAATACGA CTCACTATAG GGAGACCCAA GCTTGGTACC 



900 



GAGCTCGGAT CTGAATTCGA GCTCGCTGTT GGGCTCGCGG TTGAGGACAA ACTCTTCGCG 



960 



GTCTTTCCAG TACTCTTGGA TCGGAAAGCC GTCGGCCTCC GAACGGTACT CCGCCAGCGA 



1020 



GGGACCTGAG CGAGTCCGCA TCGACCGGAT CGGAAAACCT CTCGAGAAAG GCGTCTAACC 



1080 



AGTCACAGTC GCAAGGTAGG CTGAGCACCG TGGCGGGCGG CAGCGGGTGG CGGTCGGGGT 



1140 



TGTTTCTGGC GGAGGTGCTG CTGATGATGT AATTAAAGTA GGCGGTCTTG AGACGGCGGA
1200

TGGTCGAGGT GAGGTGTGGC AGGCTTGAGA TCCAAGATGA AGCGCGCAAG ACCGTCTGAA
1260

GATACCTTCA ACCCCGTGTA TCCATATGAC ACGGAAACCG GTCCTCCAAC TGTGCCTTTT
1320

CTTACTCCTC CCTTTGTATC CCCCAATGGG TTTCAAGAGA GTCCCCCTGG GGTACTCTCT
1380

TTGCGCCTAT CCGAACCTCT AGTTACCTCC AATGGCATGC TTGCGCTCAA AATGGGCAAC
1440

GGCCTCTCTC TGGACGAGGC CGGCAACCTT ACCTCCCAAA ATGTAACCAC TGTGAGGCCA
1500

CCTCTCAAAA AAACCAAGTC AAACATAAAC CTGGAAATAT CTGCACCCCT CACAGTTACC
1560

TCAGAAGCCC TAACTGTGGC TGCCGCCGCA CCTCTAATGG TCGCGGGCAA CACACTCACC
1620

ATGCAATCAC AGGCCCCGCT AACCGTGCAC GACTCCAAAC TTAGCATTGC CACCCAAGGA
1680

CCCCTCACAG TGTCAGAAGG AAAGCTAGCC CTGCAAACAT CAGGCCCCCT CACCACCACC
1740

GATAGCAGTA CCCTTACTAT CACTGCCTCA CCCCCTCTAA CTACTGCCAC TGGTAGCTTG
1800

GGCATTGACT TGAAAGAGCC CATTTATACA CAAAATGGAA AACTAGGACT AAAGTACGGG
1860

GCTCCTTTGC ATGTAACAGA CGACCTAAAC ACTTTGACCG TAGCAACTGG TCCAGGTGTG
1920



ACTATTAATA ATACTTCCTT GCAAACTAAA GTTACTGGAG CCTTGGGTTT TGATTCACAA 
1980 
GGCAATATGC AACTTAATGT AGCAGGAGGA CTAAGGATTG ATTCTCAAAA CAGACGCCTT
2040

ATACTTGATG TTAGTTATCC GTTTGATGCT CAAAACCAAC TAAATCTAAG ACTAGGACAG
2100

GGCCCTCTTT TTATAAACTC AGCCCACAAC TTGGATATTA ACTACAACAA AGGCCTTTAC
2160

TTGTTTACAG CTTCAAACAA TTCCAAAAAG CTTGAGGTTA ACCTAAGCAC TGCCAAGGGG
2220

TTGATGTTTG ACGCTACAGC CATAGCCATT AATGCAGGAG ATGGGCTTGA ATTTGGTTCA
2280

CCTAATGCAC CAAACACAAA TCCCCTCAAA AGAAAAA.TG ™- AOAAT^AT 
2340 

TCAAACAAGG CTATGGTTCC TAAACTAGGA ACTGGCCTTA GTTTTGACAG CACAGGTGCC
2400

ATTACAGTAG GAAACAAAAA TAATGATAAG CTAACTTTGT GGACCACACC AGCTCCATCT
2460

CCTAACTGTA GACAAATGC """"" 

mLokw* tacttgc T ac agtttcagtt t T ggc T g™ aaggcagttt 

2580 

TCTGGAACAG TTCAAAGTGC TCATCTTATT ATAAGATTTG ACGAAAATGG AGTGCTACTA
2640

AACAATTCCT TCCTGGACCC AGAATATTGG AACTTTAGAA ATGGAGATCT TACTGAAGGC
2700

ACAGCCTATA CAAACGCTGT TGGATTTATG CCTAACCTAT CAGCTTATCC AAAATCTCAC
2760

GGTAAAACTG CCAAAAGTAA CATTGTCAGT CAAGTTTACT TAAACGGAGA CAAAACTAAA
2820

CCTGTAACAC TAACCATTAC ACTAAACGGT ACACAGGAAA CAGGAGAGAC AACTCCAAGT
2880

GCATACTCTA TGTCATTTTC ATGGGACTGG TCTGGCCACA ACTACATTAA TGAAATATTT
2940

GCCACATCCT CTTACACTTT TTCATACATT GCCCAAGAAT AAAGAAGCGG CCGCTCGAGC
3000

ATGCATCTAG AGGGCCCTAT TCTATAGTGT CACCTAAATG CTAGAGCTCG CTGATCAGCC
3060

TCGACTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG
3120

ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA ATGAGGAAAT TGCATCGCAT
3180

TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG GGCAGGACAG CAAGGGGGAG
3240

GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG GCTCTATGGC TTCTGAGGCG
3300

GAAAGAACCA GCTGGGGCTC TAGGGGGTAT CCCCACGCGC CCTGTAGCGG CGCATTAAGC
3360

GCGGCGGGTG TGGTGGTTAC GCGCAGCGTG ACCGCTACAC TTGCCAGCGC CCTAGCGCCC
3420

GCTCCTTTCG CTTTCTTCCC TTCCTTTCTC GCCACGTTCG CCGGCTTTCC CCGTCAAGCT
3480

CTAAATCGGG GCATCCCTTT AGGGTTCCGA TTTAGTGCTT TACGGCACCT CGACCCCAAA
3540
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„ TT ,=0— TTCACGTAGT GGGCCATCGC CC~ «m«OC 
3600 

CCITTC ACG T « GTTCTTTAAT AGTGGACTCT ««« TGGAACAACA 

ricC.A TCTCGGTCTA TTCTTTTGAT TTATAAGGGA TTTTGGGGAT TTCGGCCTAT 
3720 

TGGTTAAAAA ATGAGCTGAT TTAACAAAAA TTTAACGCGA ATTAATTCTG TGGAATGTGT 
3780 

GTCAGTTAGG GTGTGGAAAG TCCCCAGGCT CCCCAGGCAG GCAGAAGTAT GCAAAGCATG
3840

CATCTCAATT AGTCAGCAAC CAGGTGTGGA AAGTCCCCAG GCTCGCCAGC AGGCAGAAGT 

Zaaagca tgcatctcaa ™gca ACCATAGTCC gccccctaac TCGGCCCATC 
TgIccaa ctccgcgcag ttccgcgca, tgtccgcccc atggctgact aat™ 

4020 

ATTTATGCAG AGGCCGAGGC CGCCTCTGCC TCTGAGCTAT TCCAGAAGTA GTGAGGAGGC
4080

TTTTTTGGAG GCCTAGGCTT TTGCAAAAAG CTCCCGGGAG CTTGTATATC CATTTTCGGA
4140

TCTGATCAAG AGACAGGATG AGGATCGTTT CGCATGATTG AACAAGATGG ATTGCACGCA
4200

GGTTCTCCGG CCGCTTGGGT GGAGAGGCTA TTCGGCTATG ACTGGGCACA ACAGACAATC
4260

GGCTGCTCTG ATGCCGCCGT GTTCCGGCTG TCAGCGCAGG GGCGCCCGGT TCTTTTTGTC
4320

AAGACCGACC TGTCCGGTGC CCTGAATGAA CTGCAGGACG AGGCAGCGCG GCTATCGTGG
4380

CTGGCCACGA CGGGCGTTCC TTGCGCAGCT GTGCTCGACG TTGTCACTGA AGCGGGAAGG
4440

GACTGGCTGC TATTGGGCGA AGTGCCGGGG CAGGATCTCC TGTCATCTCA CCTTGCTCCT
4500

GCCGAGAAAG TATCCATCAT GGCTGATGCA ATGCGGCGGC TGCATACGCT TGATCCGGCT
4560

ACCTGCCCAT TCGACCACCA AGCGAAACAT CGCATCGAGC GAGCACGTAC TCGGATGGAA
4620

GCCGGTCTTG TCGATCAGGA TGATCTGGAC GAAGAGCATC AGGGGCTCGC GCCAGCCGAA
4680

CTGTTCGCCA GGCTCAAGGC GCGCATGCCC GACGGCGAGG ATCTCGTCGT GACCCATGGG
4740

GATGCCTGCT TGCCGAATAT CATGGTGGAA AATGGCCGCT TTTCTGGATT CATCGACTGT
4800

GGCCGGCTGG GTGTGGCGGA CCGCTATCAG GACATAGCGT TGGCTACCCG TGATATTGCT
4860

GAAGAGCTTG GCGGCGAATG GGCTGACCGC TTCCTCGTGC TTTACGGTAT CGCCGCTCCC
4920

GATTCGCAGC GCATCGCCTT CTATCGCCTT CTTGACGAGT TCTTCTGAGC GGGACTCTGG
4980

GGTTCGAAAT GACCGACCAA GCGACGCCCA ACCTGCCATC ACGAGATTTC GATTCCACCG
5040

CCGCCTTCTA TGAAAGGTTG GGCTTCGGAA TCGTTTTCCG GGACGCCGGC TGGATGATCC
5100

TCCAGCGCGG GGATCTCATG CTGGAGTTCT TCGCCCACCC CAACTTGTTT ATTGCAGCTT
5160

ATAATGGTTA CAAATAAAGC AATAGCATCA CAAATTTCAC AAATAAAGCA TTTTTTTCAC
5220

TGCATTCTAG TTGTGGTTTG TCCAAACTCA TCAATGTATC TTATCATGTC TGTATACCGT
5280

CGACCTCTAG CTAGAGCTTG GCGTAATCAT GGTCATAGCT GTTTCCTGTG TGAAATTGTT
5340

ATCCGCTCAC AATTCCACAC AACATACGAG CCGGAAGCAT AAAGTGTAAA GCCTGGGGTG
5400

CC „ T ~c,c ™- c^c ,c TC ccc OT ^cco 

5460 

GAAACCTGTC GTGCCAGCTG CATTAATGAA TCGGCCAACG CGCGGGGAGA GGCGGTTTGC
5520

g „cg c^cccc, rcc^c™ «««« 

5580 

GGCGAGCGGT ATCAGCTCAC TCAAAGGCGG TAATACGGTT ATCCACAGAA TCAGGGGATA
5640

ACGCAGGAAA GAACATGTGA GCAAAAGGCC AGCAAAAGGC CAGGAACCGT AAAAAGGCCG
5700

™ G *C GTTTTTCCAT ««C« CCCC^CG* ***** — * 

GTGGCGAAAC CCC^C — > «— «- CCCCCTO^ 

5820 

GCTCCCTCGT GCGC.CTCCT ~CCC «»»C™ «»«™ 

5880 
TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG TAGGTATCTC AGTTCGGTGT
5940 

AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC CGTTCAGCCC GACCGCTGCG 
6000 

CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG ACACGACTTA TCGCCACTGG 
6060 

CAGCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT AGGCGGTGCT ACAGAGTTCT 
6120 

TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGGTATC TGCGCTCTGC 
6180 

TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG ATCCGGCAAA CAAACCACCG 
6240 


CTGGTAGCGG TGGTTTTTTT GTTTGCAAGC AGCAGATTAC GCGCAGAAAA AAAGGATCTC 
6300 

AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA GTGGAACGAA AACTCACGTT 
6360 

AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC CTAGATCCTT TTAAATTAAA 


6420 

AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAC AGTTACCAAT 
6480 

GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT TCGTTCATCC ATAGTTGCCT 
6540 

GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT ACCATCTGGC CCCAGTGCTG 
6600 

CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT ATCAGCAATA AACCAGCCAG 
6660 
CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC CAGTCTATTA
6720

ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC AACGTTGTTG 
6780 

CCATTGCTAC AGGCATCGTG GTGTCACGCT CGTCGTTTGG TATGGCTTCA TTCAGCTCCG
6840

GTTCCCAACG ATCAAGGCGA GTTACATGAT CCCCCATGTT GTGCAAAAAA GCGGTTAGCT
6900

CCTTCGGTCC TCCGATCGTT GTCAGAAGTA AGTTGGCCGC AGTGTTATCA CTCATGGTTA
6960

TGGCAGCACT GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT TCTGTGACTG
7020

GTGAGTACTC AACCAAGTGA TTCTGAGAAT AGTGTATGCG GCGACCGAGT TGCTCTTGCC
7080

CGGCGTCAAT ACGGGATAAT ACCGCGCGAG ATAGCAGAAC TTTAAAAGTG CTCATCATTG
7140

GAAAACGTTC TTCGGGGCGA AAACTCTCAA GGATCTTACC GCTGTTGAGA TCCAGTTCGA
7200

TGTAACCCAC TCGTGCACCC AACTGATCTT CAGCATCTTT TACTTTCACC AGCGTTTCTG
7260

GGTGAGCAAA AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG ACACGGAAAT
7320

GTTGAATACT CATACTCTTC «»» ™° ™~ 
7380 

TCATGAGCGG ATACATATTT GAATCTATTT AGAAAAATAA ACAAATAGGG GTTCCGCGCA 
7440 
CATTTCCCCG AAAAGTGCCA CCTGACGTC
7469 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TGCTTAAGCG GCCGCGAAGG AGAAGTCC 
28 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



CCGAGCTAGC GACTGAAAAT GAG 
23 



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CCTCTCGAGA GACAGCAAGA CAC 
23 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 



(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

AAGCTTGGGC AGAAATGGTT GAACTCCCGA GAGTGTCCTA CACCTAGGGG AGAAGCAGCC 

60

AAGGGGTTGT TTCCCACCAA GGACGACCCG TCTGCGCACA AACGGATGAG CCCATCAGAC 
120 

AAAGACATAT TCATTCTCTG CTGCAAACTT GGCATAGCTC TGCTTTGCCT GGGGCTATTG 
180 

GGGGAAGTTG CGGTTCGTGC TCGCAGGGCT CTCACCCTTG ACTCTTTTAA TAGCTCTTCT 
240 

GTGCAAGATT ACAATCTAAA CAATTCGGAG AACTCGACCT TCCTCCTGAG GCAAGGACCA 
300 

CAGCCAACTT CCTCTTACAA GCCGCATCGA TTTTGTCCTT CAGAAATAGA AATAAGAATG 
360 

CTTGCTAAAA ATTATATTTT TACCAATAAG ACCAATCCAA TAGGTAGATT ATTAGTTACT 
420 

ATGTTAAGAA ATGAATCATT ATCTTTTAGT ACTATTTTTA CTCAAATTCA GAAGTTAGAA 
480 

ATGGGAATAG AAAATAGAAA GAGACGCTCA ACCTCAATTG AAGAACAGGT GCAAGGACTA 
540 

TTGACCACAG GCCTAGAAGT AAAAAAGGGA AAAAAGAGTG TTTTTGTCAA AATAGGAGAC 
600 

AGGTGGTGGC AACCAGGGAC TTATAGGGGA CCTTACATCT ACAGACCAAC AGATGCCCCC 
660

TTACCATATA CAGGAAGATA TGACTTAAAT TGGGATAGGT GGGTTACAGT CAATGGCTAT 
720 
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„ ATAGA.CGC, CCCTTTTCGT GAAAGACTCG GCAGAG^AG 

!L«« «M«* — C ATGAAACAAC AGG^ACATGA — « 
840 

CTAGGAACAG GAATGCACTT TTGGGGAAAG ATTTTCCATA CCAAGGAGGG GACAGTGGCT
900

GGACTAATAG AACA^TC TGCAAAAAC, CA^ -™ ATAGCCTTTA 
960 

TTSGC CCAAC CCAGGGCTTA AGTAAGTTTT TGGTTACAAA CTGTTCTTAA 

1020 

AACGAGGATG TGAGACAAGT GGTTTCCTGA CTTGGTTTGG TATCAAAGGT TCTGATCTGA 
1080 

GCTCTGAGTG TTCTATTTTC CTATGTTCTT TTGGAATTTA TCCAAATCTT ATGTAAATGC 
1140 

TTATGTAAAC CAAGATATAA AAGAGTGCTG ATTTTTTGAG TAAAC^CA ACAGTCCTAA 
1200 

CATTCACCTC TTGTGTGTTT GTGTCTGTTC GCCATCCCGT CTCCGCTCGT CACTTATCCT 
1260 

TCACTTTCCA GAGGGTCCCC CCGCAGACCC CGGCGACCCT CAGGTCGGCC GACTGCGGCA
1320

GCTGGCGCCC GAACAGGGAC CCTCGGATAA GTGACCCTTG TCTCTATTTC TACTATTTGG 
1380 

TGTTTGTCTT GTATTGTCTC TTTCTTGTCT GGCTATCATC ACAAGAGCGG AACGGAGTCA 
1440 

CCATAGGGAC CAAGCTAGCG ACTGAAAATG AGACATATTA TCTGCCACGG AGGTGTTATT
1500

ACCGAAGAAA TGGCCGCCAG TCTTTTGGAC CAGCTGATCG AAGAGGTACT GGCTGATAAT
1560

CTTCCACCTC CTAGCCATTT TGAACCACCT ACCCTTCACG AACTGTATGA TTTAGACGTG 
1620 

ACGGCCCCCG AAGATCCCAA CGAGGAGGCG GTTTCGCAGA TTTTTCCCGA CTCTGTAATG 
1680 

TTGGCGGTGC AGGAAGGGAT TGACTTACTC ACTTTTCCGC CGGCGCCCGG TTCTCCGGAG 
1740 

CCGCCTCACC TTTCCCGGCA GCCCGAGCAG CCGGAGCAGA GAGCCTTGGG TCCGGTTTCT 
1800 

ATGCCAAACC TTGTACCGGA GGTGATCGAT CTTACCTGCC ACGAGGCTGG CTTTCCACCC 


1860 

AGTGACGACG AGGATGAAGA GGGTGAGGAG TTTGTGTTAG ATTATGTGGA GCACCCCGGG 
1920 

CACGGTTGCA GGTCTTGTCA TTATCACCGG AGGAATACGG GGGACCCAGA TATTATGTGT 
1980 

TCGCTTTGCT ATATGAGGAC CTGTGGCATG TTTGTCTACA GTAAGTGAAA ATTATGGGCA 
2040 

GTGGGTGATA GAGTGGTGGG TTTGGTGTGG TAATTTTTTT TTTAATTTTT ACAGTTTTGT 
2100 

GGTTTAAAGA ATTTTGTATT GTGATTTTTT TAAAAGGTCC TGTGTCTGAA CCTGAGCCTG 
2160 

AGCCCGAGCC AGAACCGGAG CCTGCAAGAC CTACCCGCCG TCCTAAAATG GCGCCTGCTA 
2220 

TCCTGAGACG CCCGACATCA CCTGTGTCTA GAGAATGCAA TAGTAGTACG GATAGCTGTG 
2280 






ACTC CGG T CC TTCTAACACA CCTCCTGAGA 

CGTGAGAGTT ~=TC GCCAGGCTG, GGAATGTATC GAGGACTTGC 

2400 

TI AAGGAGCC TGGGCAACCT TTGGACTTGA GCTGTAAACG CCCCAGGCCA TAAGGTGTAA 
2460 

ACCTGTGATT GCGTGTGTGG TTAACGCCTT TGTTTGCTGA ATGAGTTGAT GTAAGTTTAA
2520

TAAAGGGTGA GATAATGTTT AACTTGCATG GCGTGTTAAA TGGGGCGGGG CTTAAAGGGT 
2580 

ATATAATGCG CCGTGGGCTA ATCTTGGTTA CATCTGACCT CATGGAGGCT TGGGAGTGTT 
2640 

TGGAAGATTT TTCTGCTGTG CGTAACTTGC TGGAAGAGAG CTCTAACAGT ACCTCTTGGT 
2700 

TTTGGAGGTT TCTGTGGGGC TCATCCGAGG CAAAGTTAGT CTGCAGAATT AAGGAGGATT 
2760 

ACAAGTGGGA ATTTGAAGAG CTTTTGAAAT CCTGTGGTGA GCTGTTTGAT TCTTTGAATC 
2820 

TGGGTCACCA GGCGCTTTTC CAAGAGAAGG TCATCAAGAC TTTGGATTTT TCCACACCGG
2880

GGCGCGCTGC GGCTGCTGTT GCTTTTTTGA GTTTTATAAA GGATAAATGG AGCGAAGAAA 
2940 

CCGATCTGAG CGGGGGGTAC CTGCTGGATT TTCTGGCCAT GCATCTGTGG AGAGCGGTTG 
3000 

TGAGACACAA GAATCGCCTG CTACTGTTGT CTTCCGTCCG CCCGGCGATA ATACCGACGG
3060

AGGAGCAGCA GCAGCAGCAG GAGGAAGCCA GGCGGCGGCG GCAGGAGCAG AGCCCATGGA
3120

ACCCGAGAGC CGGCCTGGAC CCTCGGGAAT GAATGTTGTA CAGGTGGCTG AACTGTATCC
3180

AGAACTGAGA CGCATTTTGA CAATTACAGA GGATGGGCAG GGGCTAAAGG GGGTAAAGAG
3240

GGAGCGGGGG GCTTGTGAGG CTACAGAGGA GGCTAGGAAT CTAGCTTTTA GCTTAATGAC
3300

CAGACACCGT CCTGAGTGTA TTACTTTTCA ACAGATCAAG GATAATTGCG CTAATGAGCT
3360

TGATCTGCTG GCGCAGAAGT ATTCCATAGA GCAGCTGACC ACTTACTGGC TGCAGCCAGG
3420

GGATGATTTT GAGGAGGCTA TTAGGGTATA TGCAAAGGTG GCACTTAGGC CAGATTGCAA
3480

GTACAAGATC AGCAAACTTG TAAATATCAG GAATTGTTGC TACATTTCTG GGAACGGGGC
3540

CGAGGTGGAG ATAGATACGG AGGATAGGGT GGCCTTTAGA TGTAGCATGA TAAATATGTG
3600

GCCGGGGGTG CTTGGCATGG ACGGGGTGGT TATTATGAAT GTAAGGTTTA CTGGCCCCAA
3660

TTTTAGCGGT ACGGTTTTCC TGGCCAATAC CAACCTTATC CTACACGGTG TAAGCTTCTA
3720

TGGGTTTAAC AATACCTGTG TGGAAGCCTG GACCGATGTA AGGGTTCGGG GCTGTGCCTT
3780

TTACTGCTGC TGGAAGGGGG TGGTGTGTCG CCCCAAAAGC AGGGCTTCAA TTAAGAAATG
3840
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CCTCTTTGAA AGGTGTACG, TGGGTATCCT GTCTGAGGGT AACTCCAGGG TGCGCCACAA 
3900 

TGTGGCCTCC GACTGTGGTT GCTTCATGCT AGTGAAAAGC GTGGCTGTGA TTAAGCATAA 
3960 

CATGGTATGT GGCAACTGCG AGGACAGGGC CTCTCAGATG CTGACCTGCT CGGACGGCAA
4020

CTGTCACCTG CTGAAGACCA TTCACGTAGC CAGCCACTCT CGCAAGGCCT GGCCAGTGTT
4080

TGAGCATAAC ATACTGACCC GG^CCT, GCATTTGGGT AACAGGAGGG GGGTGTTCCT 
4140 

r a T UTTGCTT GAGCCCGAGA GCATGTCCAA 
ACCTTACCAA TGCAATTTGA GTCACACTAA GATATTGCTT G«. 

4200 

GGTGAACCTG AACGGGQTGT TTGACATGAC CATGAAGATC TGGAAGGTGC TGAGGTACGA 
4260 

^ rr , r ACCCTGCGA GTGTGGCGGT AAACATATTA GGAACCAGCC 
TGAGACCCGC ACCAGGTGCA GACCCTGCGA bio 

4320 

TGTGATGCTG GATGTGACCG AGGAGCTGAG GCCCGATCAC TTGGTGCTGG CCTGCACCCG

4380 

CGCTGAGTTT GGCTCTAGCG ATGAAGATAC AGATTGAGGT ACTGAAATGT GTGGGCGTGG 
4440 

CTTAAGGGTG GGAAAGAATA TATAAGGTGG GGGTCTTATG TAGTTTTGTA TCTGTTTTGC 
4500 

AGCAGCCGCC GCCGCCATGA GCACCAACTC GTTTGATGGA AGCATTGTGA GCTCATATTT 
4560 

GACAACGCGC A^CCGCCAT GGGCCGGGGT GCGTCAGAAT GTGATGGGCT CCAGCATTGA 
4620

TGGTCGCCCC GTCCTGCCCG CAAACTCTAC TACCTTGACC TACGAGACCG TGTCTGGAAC
4680

GCCGTTGGAG ACTGCAGCCT CCGCCGCCGC TTCAGCCGCT GCAGCCACCG CCCGCGGGAT
4740

TGTGACTGAC TTTGCTTTCC TGAGCCCGCT TGCAAGCAGT GCAGCTTCCC GTTCATCCGC
4800

CCGCGATGAC AAGTTGACGG CTCTTTTGGC ACAATTGGAT TCTTTGACCC GGGAACTTAA
4860

TGTCGTTTCT CAGCAGCTGT TGGATCTGCG CCAGCAGGTT TCTGCCCTGA AGGCTTCCTC
4920

CCCTCCCAAT GCGGTTTAAA ACATAAATAA AAAACCAGAC TCTGTTTGGA TTTGGATCAA
4980

GCAAGTGTCT TGCTGTCTCT CGAGGGATCT TTGTGAAGGA ACCTTACTTC TGTGGTGTGA
5040

CATAATTGGA CAAACTACCT ACAGAGATTT AAAGCTCTAA GGTAAATATA AAATTTTTAA
5100

GTGTATAATG TGTTAAACTA CTGATTCTAA TTGTTTGTGT ATTTTAGATT CCAACCTATG
5160

GAACTGATGA ATGGGAGCAG TGGTGGAATG CCTTTAATGA GGAAAACCTG TTTTGCTCAG
5220

AAGAAATGCC ATCTAGTGAT GATGAGGCTA CTGCTGACTC TCAACATTCT ACTCCTCCAA
5280

AAAAGAAGAG AAAGGTAGAA GACCCCAAGG ACTTTCCTTC AGAATTGCTA AGTTTTTTGA
5340

GTCATGCTGT GTTTAGTAAT AGAACTCTTG CTTGCTTTGC TATTTACACC ACAAAGGAAA
5400

AAGCTGCACT GCTATACAAG AAAATTATGG AAAAATATTC TGTAACCTTT ATAAGTAGGC

5460 

ATAACAGTTA TAA^AAC ATACTGTTTT TTC.TACCC ACACAGGCAT AGAGTGTCTG 
5520 

CTATTAATAA CTATGCTCAA AAATTGTGTA CCTTTAGCTT TTTAATTTGT AAAGGGGTTA 
5580 

ATAAGGAATA TTTGATGTAT AGTGCCTTGA CTAGAGATCA TAATCAGCCA TACCACATTT 
5640 

GTAGAGGTTT TACTTGCTTT AAAAAACCTC CCACACGTCC CCCTGAACCT GAAACA.AAA 
5700 

ATGAATGCAA TTGTTGTTGT TAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC 
5760 

AATAGCATCA CAAATTTCAC AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG 
5820 

TCCAAACTCA TCAATGTATC TTATCATGTC TGGATCCGGC TGTGGAATGT GTGTCAGTTA 
5880 

GGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA TGCAAAGCAT GCATCTCAAT 
5940 

TAGTCAGCAA CCAGGTGTGG AAAGTCCCCA GGCTCCCCAG CAGGCAGAAG TATGCAAAGC 
6000 

ATGCATCTCA ATTAGTCAGC AACCATAGTC CCGCCCGTAA CTCGGCCCAT CCCGCCGCTA 
6060 

ACTCCGCGCA GTTCCGCCCA TTCTCCGCCC CATGGCTGAC TAATTTTTTT TATTTATGCA 
6120 

GAGGCCGAGG CCGCCTCGGG CTCTGAGCTA TTCCAGAAGT AGTGAGGAGG CTTTTTTGGA 
6180

GGCCTAGGCT TTTGCAAAAA GCTTGGACAC AAGACAGGCT TGCGAGATAT GTTTGAGAAT
6240

ACCACTTTAT CCCGCGTCAG GGAGAGGCAG TGCGTAAAAA GACGCGGACT CATGTGAAAT
6300

ACTGGTTTTT AGTGCGCCAG ATCTCTATAA TCTCGCGCAA CCTATTTTCC CCTCGAACAC
6360

TTTTTAAGCC GTAGATAAAC AGGCTGGGAC ACTTCACATG AGCGAAAAAT ACATCGTCAC
6420

CTGGGAGATG TTGCAGATCC ATGCACGTAA ACTCGCAAGC CGACTGATGC CTTCTGAACA
6480

ATGGAAAGGC ATTATTGCCG TAAGCCGTGG CGGTCTGGTA CCGGGTGCGT TACTGGCGCG
6540

TGAACTGGGT ATTCGTCATG TCGATACCGT TTGTATTTCC AGCTACGATC ACGACAACCA
6600

GCGCGAGCTT AAAGTGCTGA AACGCGCAGA AGGCGATGGC GAAGGCTTCA TCGTTATTGA
6660

TGACCTGGTG GATACCGGTG GTACTGCGGT TGCGATTCGT GAAATGTATC CAAAAGCGCA
6720

CTTTGTCACC ATCTTCGCAA AACCGGCTGG TCGTCCGCTG GTTGATGACT ATGTTGTTGA
6780

TATCCCGCAA GATACCTGGA TTGAACAGCC GTGGGATATG GGCGTCGTAT TCGTCCCGCC
6840

AATCTCCGGT CGCTAATCTT TTCAACGCCT GGCACTGCCG GGCGTTGTTC TTTTTAACTT
6900

CAGGCGGGTT ACAATAGTTT CCAGTAAGTA TTCTGGAGGC TGCATCCATG ACACAGGCAA
6960

ACCTGAGCGA AACCCTGTTC AAACCCCGCT TTAAACATCC TGAAACCTCG ACGCTAGTCC

7020 

GCCGCTTTAA TCACGGCGCA CAACCGCCTG TGCAGTGGGC CCTTGATCGT AAAACCATGC
7080 

CTCACTGGTA TCGCATGATT AACCGTCTGA TGTGGATCTG GCGCGGCATT GAGGGAGGGG 
7140 

ftAATCCTCGA CGTGCAGGCA CGTATTGTGA TGAGCGATGC GGAACG.AGG GACGATGATT 
7200 

„TAC GGTGATTGGC TACCGTGGCG GCAACTGGAT TTATGAGTGG GCCCCGGA^ 
7260 

TTTGTGAAGG AACCTTACTT CTGTGGTGTG ACATAATTGG AGAAACTAGC TACAGAGATT
7320 

TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT GTGTTAAACT ACTGATTCTA 
7380 

ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG AATGGGAGCA GTGGTGGAAT 
7440 

„ AGGAAAACCT GTTTTGCTCA GAAGAAATGC CATCTAGTGA TGATGAGGCT 
7500 

A „CT CTCAACATTC TACTCCTCCA AAAAAGAAGA GAAAGGTAGA AGACCCCAAG 
7560 

GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG TGTTTAGTAA TAGAACTCTT 
7620 

GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC ™CAA GAAAATTATG 
7680 

GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT ATAATCATAA CATACTGTTT 
7740

TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA ACTATGCTCA AAAATTGTGT
7800

ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT ATTTGATGTA TAGTGCCTTG
7860

ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT TTACTTGCTT TAAAAAACCT
7920

CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA ATTGTTGTTG TTAACTTGTT
7980

TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC ACAAATTTCA CAAATAAAGC
8040

ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC ATCAATGTAT CTTATCATGT
8100

CTGGATCCCC AGGAAGCTCC TCTGTGTCCT CATAAACCCT AACCTCCTCT ACTTGAGAGG
8160

ACATTCCAAT CATAGGCTGC CCATCCACCC TCTGTGTCCT CCTGTTAATT AGGTCACTTA
8220

ACAAAAAGGA AATTGGGTAG GGGTTTTTCA CAGACCGCTT TCTAAGGGTA ATTTTAAAAT
8280

ATCTGGGAAG TCCCTTCCAC TGCTGTGTTC CAGAAGTGTT GGTAAACAGC CCACAAATGT
8340

CAACAGCAGA AACATACAAG CTGTCAGCTT TGCACAAGGG CCCAACACCC TGCTCATCAA
8400

GAAGCACTGT GGTTGCTGTG TTAGTAATGT GCAAAACAGG AGGCACATTT TCCCCACCTG
8460

TGTAGGTTCC AAAATATCTA GTGTTTTCAT TTTTACTTGG ATCAGGAACC CAGCACTCCA
8520
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CTG GATAAGC ATTATCCTTA TCCAAAACAG ««« AGTGTTCATC TCGTGACTGT 
8580 

CAACTGTAGC ATTTTTTGGG GTTACAGTTT GAGCAGGATA TTTGGTCCTG TAGTTTGCTA 
8640 

ACACACCCTG CAGCTCCAAA GGTTCCCCAC CAACAGCAAA AAAATGAAAA TTTGACCCTT 
8700 

GAATGGGTTT TCCAGCACCA TTTTCATGAG TTTTTTGTGT CCCTGAATGC AAGTTTAACA 
8760 

TAGCAGTTAC GCGAATAACC TCAGTTTTAA CAGTAACAGC TTCCCACATC AAAATATTTC 
8820 

CACAGGTTAA GTCCTCATTT AAATTAGGCA AAGGAATTCT TGAAGACGAA AGGGCCTCGT 
8880 

GATACGCCTA TTTTTATAGG TTAATGTCAT GATAATAATG GTTTCTTAGA CGTCAGGTGG 



8940 



CACTTTTCGG GGAAATGTGC GCGGAACGCC TATTTGTTTA TTTTTCTAAA TACATTGAAA 



9000 



TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT GAAAAAGGAA 



9060 



GAGTATGAGT ATTCAACATT TCCGTGTCGC GCTTATTCCG TTTTTTGCGG CATTTTGCCT 



9120 



TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG ATCAGTTGGG

9180 

TGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG AGAGTTTTCG 



9240 



CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTG GCGCGGTATT



9300

ATCCCGTGTT GACGCCGGGC AAGAGCAACT CGGTCGCCGC ATACACTATT CTCAGAATGA
9360

CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA CAGTAAGAGA
9420

ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC TTCTGACAAC
9480

GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC ATGTAACTCG
9540

CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC GTGACACCAC
9600

GATGCCTGCA GCAATGGCAA CAACGTTGCG CAAACTATTA ACTGGCGAAC TACTTACTCT
9660

AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG GACCACTTCT
9720

GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG GTGAGCGTGG
9780

GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA TCGTAGTTAT
9840

CTACACGACG GGGAGTCAGG CAAGTATGGA TGAACGAAAT AGACAGATCG CTGAGATAGG
9900

TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA TACTTTAGAT
9960

TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT
10020

CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA
10080
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GATGAAAGGA TCTTCTTGAG ATCCTTTTTT TCTGCGCCTA ATCTGCTGCT 
10140 

^AACCAGCG CTACGAGCGG TGGTTTGTTT GCCGGATCAA GAGCTACCAA CTCTTTTTCC
10200 

GAAGGTAACT GGCTTCAGCA GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA 
10260 

GTTAGGCCAC CACTTCAAGA ACTCTGTAGC ACCGCCTACA .ACCTCGCTC TGCTAATCCT 
10320 

GTTACCAGTG GCTGCTGCCA GTGGCGATAA GTCGTGTCTT ACCGGGTTGG ACTCAAGACG 
10380 

ATAGTTACCG GATAAGGCGC AGGGGTGGGG CTGAACGGGG GGTTCGTGCA CACAGCCCAG 
10440 

CTTGGAGCGA ACGACCTACA CCGAACTGAG ATACCTACAG CGTGAGCTAT GAGAAAGCGC 
10500 

CACGCTTCCC GAAGGGAGAA AGGCGGACAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG 
10560 

AGAGCGCACG AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT CTTTATAGTC CTGTCGGGTT 
10620 

TCGGCACCTC TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC GGAGCCTATG 
10680 

GAAAAACGCC AGCAACGCGG CCTTTTTACG GTTCCTGGCC TTTTGCTGGC CTTTTGCTCA 
10740 

CATGTTCTTT CCTGCGTTAT CCCCTGATTC TGTGGATAAC CGTATTACCG CCTTTGAGTG 
10800 

AGCTGATACC GCTCGCCGCA GCCGAACGAC CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC 
10860 
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GGAAGAGCGC CTGATGCGGT ATTTTCTCCT TACGCATCTG TGCGGTATTT CACACCGCAT 
10920 

ATGGTGCACT CTCAGTACAA TCTGCTCTGA TGCCGCATAG TTAAGCCAGT ATACACTCCG 
10980 

CTATCGCTAC GTGACTGGGT CATGGCTGCG CCCCGACACC CGCCAACACC CGCTGACGCG 
11040 

CCCTGACGGG CTTGTCTGCT CCCGGCATCC GCTTACAGAC AAGCTGTGAC CGTCTCCGGG 
11100 

AGCTGCATGT GTCAGAGGTT TTCACCGTCA TCACCGAAAC GCGCGAGGCA GC 
11152 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GACGGATCGG GAGATCTCC 
19 

(2) INFORMATION FOR SEQ ID NO: 14: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

CCGCCTCAGA AGCCATAGAG CC 
22 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14455 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Mcmooe aoaaatostt gaactcccoa gactotccta cacctagsgg agaagcagcc 

60 

AAGGGGTTGT TTCCCACCAA GGACGACCCG TCTGCGCACA AACGGATGAG CCCATCAGAC

120

AAAGACATAT TCATTCTCTG CTGCAAACTT GGCATAGCTC TGCTTTGCCT GGGGCTATTG
180

GGGGAAGTTG CGGTTCGTGC TCGCAGGGCT CTCACCCTTG ACTCTTTTAA TAGCTCTTCT
240

GTGCAAGATT ACAATCTAAA CAATTCGGAG AACTCGACCT TCCTCCTGAG GCAAGGACCA
300

CAGCCAACTT CCTCTTACAA GCCGCATCGA TTTTGTCCTT CAGAAATAGA AATAAGAATG
360

CTTGCTAAAA ATTATATTTT TACCAATAAG ACCAATCCAA TAGGTAGATT ATTAGTTACT
420

ATGTTAAGAA ATGAATCATT ATCTTTTAGT ACTATTTTTA CTCAAATTCA GAAGTTAGAA
480

ATGGGAATAG AAAATAGAAA GAGACGCTCA ACCTCAATTG AAGAACAGGT GCAAGGACTA
540

TTGACCACAG GCCTAGAAGT AAAAAAGGGA AAAAAGAGTG TTTTTGTCAA AATAGGAGAC
600

AGGTGGTGGC AACCAGGGAC TTATAGGGGA CCTTACATCT ACAGACCAAC AGATGCCCCC
660

TTACCATATA CAGGAAGATA TGACTTAAAT TGGGATAGGT GGGTTACAGT CAATGGCTAT
720

AAAGTGTTAT ATAGATCCCT CCCTTTTCGT GAAAGACTCG CCAGAGCTAG ACCTCCTTGG
780

TGTATGTTGT CTCAAGAAGA AAAAGACGAC ATGAAACAAC AGGTACATGA TTATATTTAT
840

CTAGGAACAG GAATGCACTT TTGGGGAAAG ATTTTCCATA CCAAGGAGGG GACAGTGGCT
900
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CGi « AACATTATTC XGGAAAAAG, — ^™ 
960 

TIGC CGCAAG B*» T ~ CTGTTCTTAA 
1020 

„A,G TGAGACAAGT GG TTI GC T GA — * 
1080 

TTGGAATTTA TCCAAATCTT ATGTAAATGC 



GCTCTGAGTG TTCTATTTTC CTATGTTCTT 
1140 



TTATGTAAAC =AA« W AA AAGAGTGCTG ATTTTTTGAG TAAACTTGCA ACA13TCCTAA 

Lo«« TTGTGTGTTT GTGTCTGTTC GCCATCCCGT CTCCGCTCGT CACTTATCCT 
1260 

TCACTTTCCA GAGGGTCCCC GCGCAGACGC CGGCGAGCCT CAGGTCGGGC GACTGCGGCA 
1320 

GCTGGCGCCC GAACAGGGAC CCTCGGATAA GTGACCCTTG TCTCTATTTC TACTATTTGG 
1380 

TGTTTGTCTT GTATTGTCTC TTTCTTGTCT GGCTATCATC ACAAGAGCGG AACGGACTCA
1440 

CCATAGGGAC CAAGGTAGCG ACTGAAAATg' AGAGATATTA TGTGGGAGGG AGGTGTTAT, 
1500 

ACCGAAGAAA TGGCCGCCAG TCTTTTGGAC CAGCTGATCG AAGAGGTACT GGCTGATAAT 
1560 

CTTCCACCTC CTAGCCATTT TGAAGGACCT ACCCTTCACG AACTGTATGA TTTAGACGTG 
1620 

ACGGCCCCCG AAGATGCCAA CGAGGAGGCG GTTTCGCAGA TTTTTCCCGA CTCTGTAATG 
1680

TTGGCGGTGC AGGAAGGGAT TGACTTACTC ACTTTTCCGC CGGCGCCCGG TTCTCCGGAG
1740

CCGCCTCACC TTTCCCGGCA GCCCGAGCAG CCGGAGCAGA GAGCCTTGGG TCCGGTTTCT
1800

ATGCCAAACC TTGTACCGGA GGTGATCGAT CTTACCTGCC ACGAGGCTGG CTTTCCACCC
1860

AGTGACGACG AGGATGAAGA GGGTGAGGAG TTTGTGTTAG ATTATGTGGA GCACCCCGGG
1920

CACGGTTGCA GGTCTTGTCA TTATCACCGG AGGAATACGG GGGACCCAGA TATTATGTGT
1980

TCGCTTTGCT ATATGAGGAC CTGTGGCATG TTTGTCTACA GTAAGTGAAA ATTATGGGCA
2040

GTGGGTGATA GAGTGGTGGG TTTGGTGTGG TAATTTTTTT TTTAATTTTT ACAGTTTTGT
2100

GGTTTAAAGA ATTTTGTATT GTGATTTTTT TAAAAGGTCC TGTGTCTGAA CCTGAGCCTG
2160

AGCCCGAGCC AGAACCGGAG CCTGCAAGAC CTACCCGCCG TCCTAAAATG GCGCCTGCTA
2220

TCCTGAGACG CCCGACATCA CCTGTGTCTA GAGAATGCAA TAGTAGTACG GATAGCTGTG
2280

ACTCCGGTCC TTCTAACACA CCTCCTGAGA TACACCCGGT GGTCCCGCTG TGCCCCATTA
2340

AACCAGTTGC CGTGAGAGTT GGTGGGCGTC GCCAGGCTGT GGAATGTATC GAGGACTTGC
2400

TTAACGAGCC TGGGCAACCT TTGGACTTGA GCTGTAAACG CCCCAGGCCA TAAGGTGTAA
2460

ACCTGTGATT GCGTGTGTGG TTAACGCCTT TGTTTGCTGA ATGAGTTGAT GTAAGTTTAA

2520 

TAAAGGGTGA GATAATGTTT AACTTGCA'TCa =C=™ TGGGGCGGGG CTTAAAGGGT 
2580 

„GCG CCGTGGGCA ™™ CATGTGACC, CATGGAGGCT TGGGAGTGTT 
2640 

„TX TTCTGCTGTC CGTAACTTGC TGGAACAGAG CTCTAACAGT ACC TOT GG T 
2700 

TTTGGAGGTT TCTGTGGGGC TCATCCCAGG CAAAGTTAGT CTGCAGAATT AAGGAGGATT

2760 

ACAAGTGGGA ATTTGAAGAG CTTTTGAAAT CCTGTGGTGA GCTGTTTGAT TCTTTGAATC 
2820 

XSG GTCACCA GGCGCTTTTC GAAGAGAAGG TCATCAAGAC TTTGGATTTT TCCACACCGG 
2880 

OOCGCGCTGC GGCTGCTGTT GCTTTTTTGA GTTTTATAAA GGATAAATGG AGCGAAGAAA 
2940 

CCCATCTGAG CGGGGGGTAC CTGCTGGATT TTCTGGCCAT GCATCTGTGG AGAGCGGTTG 
3000 

TGAGACACAA GAATCGCCTG CTACTGTTGT CTTCCGTCCG CCCGGCGATA ATAGCGACGG 
3060 

AGGAGCAGCA GCAGCAGCAG GAGGAAGCCA GGCGGCGGCG GCAGGAGCAG AGCCCATGGA 
3120 

ACCCGAGAGC CGGCCTGGAC CCTCGGGAAT GAATGTTGTA CAGGTGGCTG AACTGTATCC
3180 

AGAACTGAGA CGCATTTTGA CAATTAGAGA GGATGGGCAG GGGCTAAAGG GGGTAAAGAG 
3240 
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GGAGCGGGGG GCTTGTGAGG CTACAGAGGA GGCTAGGAAT CTAGCTTTTA GCTTAATGAC 
3300 

CAGACACCGT CCTGAGTGTA TTACTTTTCA ACAGATCAAG GATAATTGCG CTAATGAGCT 
3360 

TGATCTGCTG GCGCAGAAGT ATTCCATAGA GCAGCTGACC ACTTACTGGC TGCAGCCAGG 
3420 

GGATGATTTT GAGGAGGCTA TTAGGGTATA TGCAAAGGTG GCACTTAGGC CAGATTGCAA 
3480 

GTACAAGATC AGCAAACTTG TAAATATCAG GAATTGTTGC TACATTTCTG GGAACGGGGC 
3540 

CGAGGTGGAG ATAGATACGG AGGATAGGGT GGCCTTTAGA TGTAGCATGA TAAATATGTG 
3600 

GCCGGGGGTG CTTGGCATGG ACGGGGTGGT TATTATGAAT GTAAGGTTTA CTGGCCCCAA 
3660

TTTTAGCGGT ACGGTTTTCC TGGCCAATAC CAACCTTATC CTACACGGTG TAAGCTTCTA 
3720 

TGGGTTTAAC AATACCTGTG TGGAAGCCTG GACCGATGTA AGGGTTCGGG GCTGTGCCTT 
3780 

TTACTGCTGC TGGAAGGGGG TGGTGTGTCG CCCCAAAAGC AGGGCTTCAA TTAAGAAATG 
3840 

CCTCTTTGAA AGGTGTACCT TGGGTATCCT GTCTGAGGGT AACTCCAGGG TGCGCCACAA 

3900 

TGTGGCCTCC GACTGTGGTT GCTTCATGCT AGTGAAAAGC GTGGCTGTGA TTAAGCATAA 
3960 

CATGGTATGT GGCAACTGCG AGGACAGGGC CTCTCAGATG CTGACCTGCT CGGACGGCAA 
4020 



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CTGTCACCTG CTGAAGACCA TTCACGTAGC CAGCCACTCT CGCAAGGCCT GGCCAGTGTT 
4080 

TGAGCATAAC ATACTGACCC GCTGTTCCTT GCATTTGGGT AACAGGAGGG GGGTGTTCCT 
4140 

ACCTTACCAA TGCAATTTGA GTCACACTAA GATATTGCTT GAGCCCGAGA GCATGTCCAA 
4200 

GGTGAACCTG AACGGGGTGT TTGACATGAC CATGAAGATC TGGAAGGTGC TGAGGTACGA 
4260 

TGAGACCCGC ACCAGGTGCA GACCCTGCGA GTGTGGCGGT AAACATATTA GGAACCAGCC 
4320 

TGTGATGCTG GATGTGACCG AGGAGCTGAG GCCCGATCAC TTGGTGCTGG CCTGCACCGG 
4380 

CGCTGAGTTT GGCTCTAGCG ATGAAGATAC AGATTGAGGT ACTGAAATGT GTGGGGGTGG 
4440 

CTTAAGGGTG GGAAAGAATA TATAAGGTGG GGGTCTTATG TAGTTTTGTA TCTGTTTTGC 
4500 

AGCAGCCGCC GCGGCGATGA GGAGCAACTC GTTTGATGGA AGCATTGTGA GCTCATATTT 
4560 

GACAACGCGC ATGCCCCCAT GGGCCGGGGT GCGTCAGAAT GTGATGGGCT CCAGCATTGA 
4620 

T GGTC=CGCC GTCCTGCCCG CAAACTCTAC TACCTTGACC TACGAGACCG TGTCTGGAAC 
4680 

GCCGTTGGAG ACTGCAGCGT CGGGCGGCGC TTCAGCCGCT GCAGGCACGG CCCGCGGGAT 
4740 

TGTGACTGAC TTTGCTTTCC TGAGCCCGCT TGCAAGCAGT GCAGCTTCCC GTTCATCCGC 
4800

CCGCGATGAC AAGTTGACGG CTCTTTTGGC ACAATTGGAT TCTTTGACCC GGGAACTTAA
4860

TGTCGTTTCT CAGCAGCTGT TGGATCTGCG CCAGCAGGTT TCTGCCCTGA AGGCTTCCTC
4920

CCCTCCCAAT GCGGTTTAAA ACATAAATAA AAAACCAGAC TCTGTTTGGA TTTGGATCAA
4980

GCAAGTGTCT TGCTGTCTCT CGAGGGATCT TTGTGAAGGA ACCTTACTTC TGTGGTGTGA
5040

CATAATTGGA CAAACTACCT ACAGAGATTT AAAGCTCTAA GGTAAATATA AAATTTTTAA
5100

GTGTATAATG TGTTAAACTA CTGATTCTAA TTGTTTGTGT ATTTTAGATT CCAACCTATG
5160

GAACTGATGA ATGGGAGCAG TGGTGGAATG CCTTTAATGA GGAAAACCTG TTTTGCTCAG
5220

AAGAAATGCC ATCTAGTGAT GATGAGGCTA CTGCTGACTC TCAACATTCT ACTCCTCCAA
5280

AAAAGAAGAG AAAGGTAGAA GACCCCAAGG ACTTTCCTTC AGAATTGCTA AGTTTTTTGA
5340

GTCATGCTGT GTTTAGTAAT AGAACTCTTG CTTGCTTTGC TATTTACACC ACAAAGGAAA
5400

AAGCTGCACT GCTATACAAG AAAATTATGG AAAAATATTC TGTAACCTTT ATAAGTAGGC
5460

ATAACAGTTA TAATCATAAC ATACTGTTTT TTCTTACTCC ACACAGGCAT AGAGTGTCTG
5520

CTATTAATAA CTATGCTCAA AAATTGTGTA CCTTTAGCTT TTTAATTTGT AAAGGGGTTA
5580

ATAAGGAATA TTTGATGTAT AGTGCCTTGA CTAGAGATCA TAATCAGCCA TACCACATTT

5640 

— c OT — gcacacccc cccxgaacc, gaaaca^aa 

5700 

ATGAATGCAA TTGTTGTTGT TAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC

5760 

AATAGCATCA CAAATTTCAC AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG

5820 

TCC AAAC T GA »*C««C »«C«C TGTGGAATGT G^A 

—00 «CCCCAGCA GGCAGAAGTA TGCAAAGCAT GCATCTCAAT 

5940 

GGCTCCCCAG CAGGCAGAAG TATGCAAAGC 



TAGTCAGCAA CCAGGTGTGG AAAGTCCCCA 
6000 

ATGCATCTCA ATTAGTCAGC AACCATAGTC CCGCGCC.AA CXCCGCCGA, GCCGCCCCA 

ACTCCGCCCA G TT CCGCCCA TTCTCCGCCC CATGGCTGAC TAATTTTTTT TATTTATGCA 
6120 

GAGGCCGAGG CCGCCTCGGC CTCTGAGCTA TTCCAGAAGT AGTGAGGAGG CTTTTTTGGA 
6180 

GGCCTAGGCT TTTGCAAAAA GCTTGGACAC AAGACAGGCT TGCGAGATAT GTTTGAGAAT

6240 

.CCACTTTAT CCCGGG.CAG GGAGAGGCAG TGCGTAAAAA GACGCGGACT CATGTGAAAT 
6300 

ACTGGTTTTT AGTGCGCCAG ATCTCTATAA TCTCGCGCAA CCTATTTTCC CCTCGAACAC

6360

TTTTTAAGCC GTAGATAAAC AGGCTGGGAC ACTTCACATG AGCGAAAAAT ACATCGTCAC
6420

CTGGGACATG TTGCAGATCC ATGCACGTAA ACTCGCAAGC CGACTGATGC CTTCTGAACA
6480

ATGGAAAGGC ATTATTGCCG TAAGCCGTGG CGGTCTGGTA CCGGGTGCGT TACTGGCGCG
6540

TGAACTGGGT ATTCGTCATG TCGATACCGT TTGTATTTCC AGCTACGATC ACGACAACCA
6600

GCGCGAGCTT AAAGTGCTGA AACGCGCAGA AGGCGATGGC GAAGGCTTCA TCGTTATTGA
6660

TGACCTGGTG GATACCGGTG GTACTGCGGT TGCGATTCGT GAAATGTATC CAAAAGCGCA
6720

CTTTGTCACC ATCTTCGCAA AACCGGCTGG TCGTCCGCTG GTTGATGACT ATGTTGTTGA
6780

TATCCCGCAA GATACCTGGA TTGAACAGCC GTGGGATATG GGCGTCGTAT TCGTCCCGCC
6840

AATCTCCGGT CGCTAATCTT TTCAACGCCT GGCACTGCCG GGCGTTGTTC TTTTTAACTT
6900

CAGGCGGGTT ACAATAGTTT CCAGTAAGTA TTCTGGAGGC TGCATCCATG ACACAGGCAA
6960

ACCTGAGCGA AACCCTGTTC AAACCCCGCT TTAAACATCC TGAAACCTCG ACGCTAGTCC
7020

GCCGCTTTAA TCACGGCGCA CAACCGCCTG TGCAGTCGGC CCTTGATGGT AAAACCATCC
7080

CTCACTGGTA TCGCATGATT AACCGTCTGA TGTGGATCTG GCGCGGCATT GACCCACGCG
7140
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„a cgtccaggca .cgtattctga T gagcga,gc ggaaggtagg «««t 

7200 

„ag gg^ggc — ™ K 

AACCTTACTT ACATAATTGG ACAAACTACC XACAOAOA. 

7320 

TAAAGCTCTA AGGTAAATAT AAAATTTTTA AGTGTATAAT GTGTTAAACT ACTGATTCTA

7380 

ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG AATGGGAGCA GTGGTGGAAT

7440 

GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC CATCTAGTGA TGATGAGGCT

7500 

ACTGCTGACT GTCAACA^C TACTCCTCCA AAAAAGAAGA GAAAGGTAGA AGACCCCAAG 
7560 

GACTTTCCTT CAGAATTGCT AAGTTTTTTG AGTCATGCTG TGTTTAGTAA TAGAACTCTT

7620 

GCTTGCTTTG CTATTTACAC CACAAAGGAA AAAGCTGCAC TGCTATACAA GAAAATTATG

7680 

GAAAAATATT CXGTAAC^ TATAAGTAGG CATAACAGTT ATAATCATAA CATACTOTTT 

ILaCTG GACACAGGCA TAGAGTGTCT GGTA^ATA ACTA.GCTCA AAAAT*^ 
7800 

ACCTTTAGCT TTTTAATTTG TAAAGGGGTT AATAAGGAAT ATTTGATGTA TAGTGCCTTG

7860 

ACTAGAGATC ATAATCAGCC ATACCACATT TGTAGAGGTT TTACTTGCTT TAAAAAACCT

7920 
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CCCACACCTC CCCCTGAACC TGAAACATAA AATGAATGCA ATTGTTGTTG TTAACTTGTT 
7980 

TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC ACAAATTTCA CAAATAAAGC 


8040 

ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC ATCAATGTAT CTTATCATGT 
8100 

CTGGATCCCC AGGAAGCTCC TCTGTGTCCT CATAAACCCT AACCTCCTCT ACTTGAGAGG 
8160 

ACATTCCAAT CATAGGCTGC CCATCCACCC TCTGTGTCCT CCTGTTAATT AGGTCACTTA 
8220 

ACAAAAAGGA AATTGGGTAG GGGTTTTTCA CAGACCGCTT TCTAAGGGTA ATTTTAAAAT 
8280 

ATCTGGGAAG TCCCTTCCAC TGCTGTGTTC CAGAAGTGTT GGTAAACAGC CCACAAATGT 
8340 

CAACAGCAGA AACATACAAG CTGTCAGCTT TGCACAAGGG CCCAACACCC TGCTCATCAA 
8400 

GAAGCACTGT GGTTGCTGTG TTAGTAATGT GCAAAACAGG AGGCACATTT TCCCCACCTG
8460

TGTAGGTTCC AAAATATCTA GTGTTTTCAT TTTTACTTGG ATCAGGAACC CAGCACTCCA
8520 

CTGGATAAGC ATTATCCTTA TCCAAAACAG CCTTGTGGTC AGTGTTCATC TGCTGACTGT 
8580 

CAACTGTAGC ATTTTTTGGG GTTACAGTTT GAGCAGGATA TTTGGTCCTG TAGTTTGCTA 
8640 

ACACACCCTG CAGCTCCAAA GGTTCCCCAC CAACAGCAAA AAAATGAAAA TTTGACCCTT
8700 



GAATGGGTTT TCCAGCACCA TTTTCATGAG TTTTTTGTGT CCCTGAATGC AAGTTTAACA

8760 

TAGCAGTTAC CCCAATAACC TCAGTTTTAA CAGTAACAGC TTCCCACATC AAAATATTTC

8820 

CACAGGTTAA GTCCTCATTT AAATTAGGCA AAGGAATTCT TGAAGACGAA AGGGCCTCGT

8880 

GATACGCCTA TTTTTATAGG TTAATGTCAT GATAATAATG GTTTCTTAGA CGTCAGGTGG

8940 

CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA TTTTTCTAAA TACATTCAAA

9000 

TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT GAAAAAGGAA

9060 

GAGTATGAGT ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG CATTTTGCCT 
9120 

TCCTGTTTTT GGTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG ATCAGTTGGG 
9180 

„GTG GGTTACATCG AACTGGATCT caacagcggt aagatccttg agagt^tcg 

9240 

CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTG GCGCGGTATT

9300 

ATC ccgtgtt gacgccgggc aagagcaact cggtcgccgc atacactat, ctcagaatga 

9360 

CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA CAGTAAGAGA

9420 

MT atgcagt gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 

9480

GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC ATGTAACTCG
9540

CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC GTGACACCAC
9600

GATGCCTGCA GCAATGGCAA CAACGTTGCG CAAACTATTA ACTGGCGAAC TACTTACTCT
9660

AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG GACCACTTCT
9720

GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG GTGAGCGTGG
9780

GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA TCGTAGTTAT
9840

CTACACGACG GGGAGTCAGG CAACTATGGA TGAACGAAAT AGACAGATCG CTGAGATAGG
9900

TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA TACTTTAGAT
9960

TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT
10020

CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA
10080

GATCAAAGGA TCTTCTTGAG ATCCTTTTTT TCTGCGCGTA ATCTGCTGCT TGCAAACAAA
10140

AAAACCACCG CTACCAGCGG TGGTTTGTTT GCCGGATCAA GAGCTACCAA CTCTTTTTCC
10200

GAAGGTAACT GGGTTCAGCA GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA
10260

GTTAGGCCAC CACTTCAAGA ACTCTGTAGC ACCGCCTACA TACCTCGCTC TGCTAATCCT

10320 

GTTACCAGTG GCTGCTGCCA GTGGCGATAA GTCGTGTCTT ACCGGGTTGG ACTCAAGACG 
10380 

ATAGTTACCG GATAAGGCGC AGCGGTGGGG CTGAACGGGG GGTTCGTGCA CACAGCCCAG 
10440 

CTTGGAGCGA AGGAGGTAGA CCGAACTGAG ATAGGTAGAG CGTGAGCTAT GAGAAAGCGC 
10500 

CACGGTTGGG GAAGGGAGAA AGGGGGAGAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG 
10560 

AGAGCGCACG AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT CTTTATAGTC CTGTCGGGTT

10620 

.GGGCAGG.G TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC GGAGCCTATG 
10680 

GAAAAACGCC AGGAACGCGG CCTTTTTACG GTTCCTGGCC TTTTGCTGGC GT^GCTCA 
10740 

CATGTTCTTT CCTGCGTTAT CCGCTGATTG TGTGGATAAC GGTATTAGGG CCTTTGAGTG 
10800 

AGCTGATACC GGTGGCGGGA GGGGAACGAC CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC 
10860 

GGAAGAGCGC CTGATGCGGT ATTTTCTCCT TACGCATCTG TGCGGTATTT CACACCGCAT 
10920 

AGCGGGTCAG AAGCCATAGA GGGGAGGGCA TCCCCAGCAT GCCTGCTATT GTCTTCCCAA 
10980 

TCCTCCCCCT TGCTGTCCTG CCCCACCCCA CCCCCCAGAA TAGAATGACA CCTACTCAGA

11040 
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CAATGCGATG CAATTTCCTC ATTTTATTAG GAAAGGACAG TGGGAGTGGC ACCTTCCAGG 
11100 

GTCAAGGAAG GGACGGGGGA GGGGCAAACA ACAGATGGCT GGCAACTAGA AGGCACAGTC 
11160 

GAGGCTGATC AGCGAGCTCT AGCATTTAGG TGACACTATA GAATAGGGCC CTCTAGATGC 
11220 

ATGCTCGAGC GGCCGCTTCT TTATTCTTGG GCAATGTATG AAAAAGTGTA AGAGGATGTG
11280

GCAAATATTT CATTAATGTA GTTGTGGCCA GACCAGTCCC ATGAAAATGA CATAGAGTAT
11340 

GCACTTGGAG TTGTGTCTCC TGTTTCCTGT GTACCGTTTA GTGTAATGGT TAGTGTTACA 
11400 

GGTTTAGTTT TGTCTCCGTT TAAGTAAACT TGACTGACAA TGTTACTTTT GGCAGTTTTA 
11460 

CCGTGAGATT TTGGATAAGC TGATAGGTTA GGCATAAATC CAACAGCGTT TGTATAGGCT 
11520 

GTGCCTTCAG TAAGATCTCC ATTTCTAAAG TTCCAATATT CTGGGTCCAG GAAGGAATTG 
11580 

TTTAGTAGCA CTCCATTTTC GTCAAATCTT ATAATAAGAT GAGCACTTTG AACTGTTCCA 
11640 

GATATTGGAG CCAAACTGCC TTTAACAGCC AAAACTGAAA CTGTAGCAAG TATTTGACTG 
11700 

CCACATTTTG TTAAGACCAA AGTGAGTTTA GCATCTTTCT CTGCATTTAG TCTACAGTTA
11760

GGAGATGGAG CTGGTGTGGT CCACAAAGTT AGCTTATCAT TATTTTTGTT TCCTACTGTA
11820

ATGGCACCTG TGCTGTCAAA ACTAAGGCCA GTTCCTAGTT TAGGAACCAT AGCCTTGTTT

11880 

, mmn ,. P , rrr gaT^TGTGTT TGGTGCATTA 
^, r „. mr rrr^ATTTTT GTTTTGAGGG LrAi - i^x^x 
GAATCAAATT CTAGGCCATG GCCAATIIii 

11940 

GGTGAACCAA ATTCAAGGGG ATCTGCTGCA TTAATGGCA TGGCTGTAGC GTCAAACATC 
12000 

AACCCCTTGG CAGTGCTTAG GTTAACCTCA AGCTTTTTGG AATTGTTTGA AGCTGTAAAC

12060 

AAGTAAAGGC CTTTGTTGTA GTTAATATCC AAGTTGTGGG CTGAGTTTAT AAAAAGAGGG 
12120 

CCCTGTCCTA GTCrTAGATT TAGTTGGTTT TGAGCATCAA ACGGATAACT AACATCAAGT 
12180 

ATAAGGCGTC TGTTTTGAGA ATCAATCCTT AGTCCTCCTG CTACATTAAG TTGCATATTG

12240 

CCTTGTGAAT CAAAACCCAA GGCTCCAGTA ACTTTAGTTT GCAAGGAAGT ATTATTAATA 
12300 

GTCAGACCTG GACCAGTTGC TACGGTCAAA GTGTTTAGGT CGTCTGTTAC ATGCAAAGGA 
12360 

GCCCCGTACT TTAGTCCTAG TTTTCCATTT TGTGTATAAA TGGGCTCTTT CAAGTCAATG 
12420 

CCCAAGCTAC CAGTGGCAGT AGTTAGAGGG GGTGAGGCAG TGATAGTAAG GGTACTGCTA 
12480 

TCGGTGGTGG TGAGGGGGCC TGATGTTTGC AGGGCTAGCT TTCCTTCTCA CACTGTGAGG 
12540 

GGTCCTTGGG TGGCAATGCT AAGTTTGGAG TGGTGCACGG TTAGCGGGGC CTGTGATTGC 
12600

ATGGTGAGTG TGTTGCCCGC GACCATTAGA GGTGCGGCGG CAGCCACAGT TAGGGCTTCT
12660

GAGGTAACTG TGAGGGGTGC AGATATTTCC AGGTTTATGT TTGACTTGGT TTTTTTGAGA
12720

GGTGGGCTCA CAGTGGTTAC ATTTTGGGAG GTAAGGTTGC CGGCCTCGTC CAGAGAGAGG
12780

CCGTTGCCCA TTTTGAGCGC AAGCATGCCA TTGGAGGTAA CTAGAGGTTC GGATAGGCGC
12840

AAAGAGAGTA CCCCAGGGGG ACTCTCTTGA AACCCATTGG GGGATACAAA GGGAGGAGTA
12900

AGAAAAGGCA CAGTTGGAGG ACCGGTTTCC GTGTCATATG GATACACGGG GTTGAAGGTA
12960

TCTTCAGACG GTCTTGCGCG CTTCATCTTG GATCTCAAGC CTGCCACACC TCACCTCGAC
13020

CATCCGCCGT CTCAAGACCG CCTACTTTAA TTACATCATC AGCAGCACCT CCGCCAGAAA
13080

CAACCCCGAC CGCCACCCGC TGCCGCCCGC CACGGTGCTC AGCCTACCTT GCGACTGTGA
13140

CTGGTTAGAC GCCTTTCTCG AGAGGTTTTC CGATCCGGTC GATGCGGACT CGCTCAGGTC
13200

CCTCGGTGGC GGAGTACCGT TCGGAGGCCG ACGGGTTTCC GATCCAAGAG TACTGGAAAG
13260

ACCGCGAAGA GTTTGTCCTC AACCGCGAGC CCAACAGCGA GCTCGAATTC AGATCCGAGC
13320

TCGGTACCAA GCTTGGGTCT CCCTATAGTG AGTCGTATTA ATTTCGATAA GCCAGTAAGC
13380
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agtgggttct nne agagagctct gcttatatag acgtcgcacc gtacacggct 

13440 

ACCGCCCATT TGCGTCAATG GGGCGGAGTT GTTACGACAT TTTGGAAAGT CCCGTTGATT 
13500 

TTGGTGCCAA AACAAACTCC CATTGACGTC AATGGGGTGG AGACTTGGAA ATCCCCGTGA 
13560 

GTCAAACCGC TATCCACGCC CATTGATGTA CTGCCAAAAC CGCATCACCA TGGTAATAGC 
13620 

GATGACTAAT ACGTAGATGT ACTGCCAAGT AGGAAAGTCC CATAAGGTCA TGTACTGGGC 
13680 

ATAATGCCAG GCGGGCCATT TACCGTCATT GACGTCAATA GGGGGCGTAC TTGGCATATG 
13740 

ATACACTTGA TGTACTGCCA AGTGGGCAGT TTACCGTAAA TAGTCCACCC ATTGACGTCA 
13800 

ATGGAAAGTG CCTATTGGCG TTACTATGGG AACATACGTC ATTATTGACG TCAATGGGCG 
13860 

GGGGTCGTTG GGCGGTCAGC CAGGCGGGCC ATTTACCGTA AGTTATGTAA CGCGGAACTC 
13920 

CATATATGGG CTATGAACTA ATGACCCCGT AATTGATTAC TATTAATAAC TAGTCAATAA 
13980 

TCAATGTCAA GGCGTATATC TGGCCCGTAC ATCGCGAAGC AGCGCAAAAC GCCTAACCCT 
14040 

AAGCAGATTC TTCATGCAAT TGTCGGTCAA GCCTTGCCTT GTTGTAGCTT AAATTTTGCT 
14100 

CGCGCACTAC TCAGCGACCT GCAACACAGA AGCAGGGAGG AGATACTGGC TTAACTATGC 
14160 
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GGCATCAGAG CAGATTGTAC TGAGAGTCGA CCATAGGGGA TCGGGAGATC TCCCGATCCG 
14220 

TCTATGGTGC ACTCTCAGTA CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT 
14280 

CCGCTATCGC TACGTGACTG GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC 
14340 

GCGCCCTGAC GGGCTTGTCT GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC
14400 

GGGAGCTGCA TGTGTCAGAG GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGC 
14455 

(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10610 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GACGGATCGG GAGATCCGCG CGGTACACAG AATTCAGGAG ACACAACTCC AAGTGCATAC 
60 

TCTATGTCAT TTTCATGGGA CTGGTCTGGC CACAACTACA TTAATGAAAT ATTTGCCACA 
120 
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TCCTCTTACA CTTTTTCATA CATTGGGGAA GAATAAAGAA ™«™" 
180 

CGTGTTTATT TTTCAATTGC AGAAAATTTC AAGTCATTTT TCATTCAGTA G T A T A=GGGC 
240 

ACCACCACAT AGCTTATACA GATCACCGTA CCTTAATCAA ACTCACAGAA CCCTAGTATT

300 

CA ACC TG CCA CGTGGCCGG AACACACAGA GTAGAGAG.G CTTTCTCCCC GGCTGGCCTT 
360 

AAAAAGCATC ATATCATGGG TAACAGACAT ATTCTTAGGT GTTATATTCC ACACGGTTTC 
420 

CTGTCGAGCC AAACGCTCAT CAGTGATATT AATAAACTCC CCGGGCAGCT CACTTAAGTT 
480 

CATGTCGCTG TCCAGCTGCT GAGCCACAGG CTGCTGTCCA ACTTGCGGTT GCTTAACGGG 
540 

CGGCGAAGGA GAAGTCCACG CCTACATGGG GGTAGAGTCA TAATCGTGCA TCAGGATAGG 
600 

GCGGTGGTGC TGCAGCAGCG CGCGAATAAA CTGCTGCCGC CGCCGCTCCG TGGTGCAGGA 
660 

ATACAACATG GCAGTGGTCT CCTCAGCGAT GATTCGCACC GCCCGCAGCA TAAGGCGCCT 
720 

TGTCCTCCGG GCACAGCAGC GGAGGGTGAT CTCACTTAAA TCAGCACAGT AACTGCAGCA 
780 

CAGGAGCAGA ATATTGTTCA AAATCCCACA GTGCAAGGCG CTGTATCCAA AGCTCATGGC 
840 

GGGGACCACA GAACCCACGT GGCCATCATA GCAGAAGCGG AGGTAGATTA AGTGGCGACC 
900

CCTCATAAAC ACGCTGGACA TAAACATTAC CTCTTTTGGC ATGTTGTAAT TCACCACCTC
960

CCGGTACCAT ATAAACCTCT GATTAAACAT GGCGCCATCC ACCACCATCC TAAACCAGCT
1020

GGCCAAAACC TGCCCGCCGG CTATACACTG CAGGGAACCG GGACTGGAAC AATGACAGTG
1080

GAGAGCCCAG GACTCGTAAC CATGGATCAT CATGCTCGTC ATGATATCAA TGTTGGCACA
1140

ACACAGGCAC ACGTGCATAC ACTTCCTCAG GATTACAAGC TCCTCCCGCG TTAGAACCAT
1200

ATCCCAGGGA ACAACCCATT CCTGAATCAG CGTAAATCCC ACACTGCAGG GAAGACCTCG
1260

CACGTAACTC ACGTTGTGCA TTGTCAAAGT GTTACATTCG GGCAGCAGCG GATGATCCTC
1320

CAGTATGGTA GCGCGGGTTT CTGTCTCAAA AGGAGGTAGA CGATCCCTAC TGTACGGAGT
1380

GCGCCGAGAC AACCGAGATC GTGTTGGTCG TAGTGTCATG CCAAATGGAA CGCCGGACGT
1440

AGTCATATTT CCTGAAGCAA AACCAGGTGC GGGCGTGACA AACAGATCTG CGTCTCCGGT
1500

CTCGCCGCTT AGATCGCTCT GTGTAGTAGT TGTAGTATAT CCACTCTCTC AAAGCATCCA
1560

GGCGCCCCCT GGCTTCGGGT TCTATGTAAA CTCCTTCATG CGCCGCTGCC CTGATAACAT
1620

CCACCACCGC AGAATAAGCC ACACCCAGCC AACCTACACA TTCGTTCTGC GAGTCACACA
1680
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cgggaggagg gggaagagct «~ »» — mT TTTA " KAi 

1740 

, mr , ^rrrrrTCCC CTCCGGTGGC GTGGTCAAAC 
A^CCTCAA AATGAAGATC TATTAAGTGA ACGCGCTCCC 

1800 

TCTACAGCCA AAGAACAGAT AATGGCATTT ««« ^AATGGC TTCCAAAAGG 
1860

CAAACGGCCC TCACGTCCAA GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT

1920 

, T AAACATTC CAGCACCTTC AACCATGCGC AAATAATTCT GATCTGGGGA GCTTCTCAAT 
1980 

ATATCTCTAA GGAAATCGCG AATATTAAGT TAAAAATCTG CTGCAGAGGG 

2040 

CCCTCCACCT TCAGCCTCAA GCAGCCAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA 
2100 

CCTGTATAAG ATTCAAAAGC GGAACATTAA GAAAAATACC GCGATCCCGT AGGTCCCTTC 
2160 

GCAGGGCCAG CTGAACATAA TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC
2220

CAGGAACCTT GACAAAAGAA CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA
2280

CCAGCGTAGC CCCGATGTAA GCTTTGTTGC ATGGGCGGCG ATATAAAATG CAAGGTGCTG 
2340 

CTCAAAAAAT GAGGCAAAGC CTCGCGCAAA AAAGAAAGGA CATCGTAGTC ATGCTCATGC 
2400 

AGATAAAGGC AGGTAAGCTC CGGAACCACC AGAGAAAAAG ACACCATTTT TCTCTCAAAC 
2460 
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ATGTCTGCGG GTTTCTGCAT AAACACAAAA TAAAATAACA AAAAAACATT TAAACATTAG 
2520 

AAGCCTGTCT TACAACAGGA AAAACAACCC TTATAAGCAT AAGACGGACT ACGGCCATGC 
2580 

CGGCGTGACC GTAAAAAAAC TGGTCACCGT GATTAAAAAG CACCACCGAC AGCTCCTCGG 
2640 

TCATGTCCGG AGTCATAATG TAAGACTCGG TAAACACATC AGGTTGATTC ATCGGTCAGT 
2700 

GCTAAAAAGC GACCGAAATA GCCCGGGGGA ATACATACCC GCAGGCGTAG AGACAACATT 
2760 

ACAGCCCCCA TAGGAGGTAT AACAAAATTA ATAGGAGAGA AAAACACATA AACACCTGAA 
2820 

AAACCCTCCT GCCTAGGCAA AATAGCACCC TCCCGCTCCA GAACAACATA CAGCGCTTCA 
2880 

CAGCGGCAGC CTAACAGTCA GCCTTACCAG TAAAAAAGAA AACCTATTAA AAAAACACCA 
2940 

CTCGACACGG CACCAGCTCA ATCAGTCACA GTGTAAAAAA GGGCCAAGTG CAGAGCGAGT 
3000 

ATATATAGGA CTAAAAAATG ACGTAACGGT TAAAGTCCAC AAAAAACACC CAGAAAACCG 

3060 

CACGCGAACC TACGCCCAGA AACGAAAGCC AAAAAACCCA CAACTTCCTC AAATCGTCAC 
3120 

TTCCGTTTTC CCACGTTACG TAACTTCCCG GATCCTCTCC CGATCCCCTA TGGTCGACTC 
3180 

TCAGTACAAT CTGCTCTGAT GCCGCATAGT TAAGCCAGTA TCTGCTCCCT GCTTGTGTGT 
3240 

TGGAGGTCGC TGAGTAGTGC GCGAGCAAAA TTTAAGCTAC AACAAGGCAA GGCTTGACCG

3300 

ACAATTGCAT GAAGAATCTG CTTAGGGTTA GGCGTTTTGC GCTGCTTCGC GATGTACGGG 
3360 

CGAGATATAG GCGTTGACAT TGATTATTGA CTAGTTATTA ATAGTAATCA ATTACGGGGT 
3420 

CATTAGTTCA TAGCCCATAT ATGGAGTTCC GCGTTACATA ACTTACGGTA AATGGCCCGC 
3480 

CTGGCTGACC GCCGAAGGAC GGCCGCGCAT TGACGTCAAT AATGACGTAT GTTGGGATAG 
3540 

TAACGCCAAT AGGGACTTTC CATTGACGTC AATGGGTGGA CTATTTACGG TAAACTGCCC 
3600 

ACTTGGCAGT ACATCAAGTG TATCATATGC CAAGTACGCC CCCTATTGAC GTCAATGACG 
3660 

GTAAATGGCC CGCCTGGCAT TATGCCCAGT ACATGACCTT ATGGGACTTT CCTACTTGGC 
3720 

AGTACATCTA CGTATTAGTC ATCGCTATTA CCATGGTGAT GCGGTTTTGG CAGTACATCA 
3780 

ATGGGCGTGG ATAGCGGTTT GACTCACGGG GATTTCCAAG TCTCCACCCC ATTGACGTCA 
3840 

ATGGGAGTTT GTTTTGGCAC CAAAATCAAC GGGACTTTCC AAAATGTCGT AACAACTCCG 
3900 

CCCCATTGAC GCAAATGGGC GGTAGGCGTG TACGGTGGGA GGTCTATATA AGCAGAGCTC 
3960 

TCTGGCTAAC TAGAGAACCC ACTGCTTACT GGCTTATCGA AATTAATACG ACTCACTATA

4020 

GGGAGACCCA AGCTTGGTAC CGAGCTCGGA TCTGAATTCG AGCTCGCTGT TGGGCTCGCG
4080

GTTGAGGACA AACTCTTCGC GGTCTTTCCA GTACTCTTGG ATCGGAAACC CGTCGGCCTC
4140

CGAACGGTAC TCCGCCACCG AGGGACCTGA GCGAGTCCGC ATCGACCGGA TCGGAAAACC
4200

TCTCGAGAAA GGCGTCTAAC CAGTCACAGT CGCAAGGTAG GCTGAGCACC GTGGCGGGCG
4260

GCAGCGGGTG GCGGTCGGGG TTGTTTCTGG CGGAGGTGCT GCTGATGATG TAATTAAAGT
4320

AGGCGGTCTT GAGACGGCGG ATGGTCGAGG TGAGGTGTGG CAGGCTTGAG ATCCAAGATG
4380

AAGCGCGCAA GACCGTCTGA AGATACCTTC AACCCCGTGT ATCCATATGA CACGGAAACC
4440

GGTCCTCCAA CTGTGCCTTT TCTTACTCCT CCCTTTGTAT CCCCCAATGG GTTTCAAGAG
4500

AGTCCCCCTG GGGTACTCTC TTTGCGCCTA TCCGAACCTC TAGTTACCTC CAATGGCATG
4560

CTTGCGCTCA AAATGGGCAA CGGCCTCTCT CTGGACGAGG CCGGCAACCT TACCTCCCAA
4620

AATGTAACCA CTGTGAGCCC ACCTCTCAAA AAAACCAAGT CAAACATAAA CCTGGAAATA
4680

TCTGCACCCC TCACAGTTAC CTCAGAAGCC CTAACTGTGG CTGCCGCCGC ACCTCTAATG
4740

GTCGCGGGCA ACACACTCAC CATGCAATCA CAGGCCCCGC TAACCGTGCA CGACTCCAAA
4800

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cttagcattg cgagggaagg aggggtgaga gtgtgagaag gaaaggtagg gctggaaaca 

4860 

t gagggcggg tcacgaccac ggatagcagt agggttagta tcactgggtc aggggctcta 

4920 

agtagtggga gtggtaggtt ggggattgag ^gaaagagg ggatttatag agaaaatgga 

4980 

AAACTAGGAC TAAAGTACGG GGCTCCTTTG CATGTAACAG ACGACCTAAA CACTTTGACC

5040 

gtagcaactg gtggaggtgt gagtattaat aatagttggt tgcaaagtaa agttactgga 

5100 

GCCTTGGGTT TTGATTCACA AGGCAATATG CAACTTAATG TAGCAGGAGG ACTAAGGATT 
5160 

GATTCTCAAA ACAGACGCCT TATACTTGAT GTTAGTTATC CGTTTGATGC TCAAAACCAA 
5220 

CTAAATCTAA GACTAGGACA GGGCCCTCTT TTTATAAACT CAGCCCACAA CTTGGATATT

5280 

aagtagaaga aaggggttta gttgtttaga ggttgaaaga attggaaaaa ggttgaggtt 

5340 

aaggtaagga gtgggaaggg gttgatgttt gagggtagag ggatagggat taatggagga 

5400 

gatggggttg aatttggttg aggtaatgga ggaaagagaa atggggtgaa aagaaaaatt 

5460 

GGCCATGGCC TAGAATTTGA TTCAAACAAG GCTATGGTTC CTAAACTAGG AACTGGCCTT 
5520 

AGTTTTGACA GCACAGGTGC CATTACAGTA GGAAACAAAA ATAATGATAA GCTAACTTTG 
5580 
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TGGACCACAC CAGCTCCATC TCCTAACTGT AGACTAAATG CAGAGAAAGA TGCTAAACTC 
5640 

ACTTTGGTCT TAACAAAATG TGGCAGTCAA ATACTTGCTA CAGTTTCAGT TTTGGCTGTT 
5700 

AAAGGCAGTT TGGCTCCAAT ATCTGGAACA GTTCAAAGTG CTCATCTTAT TATAAGATTT 
5760 

GACGAAAATG GAGTGCTACT AAACAATTCC TTCCTGGACC CAGAATATTG GAACTTTAGA 
5820 

AATGGAGATC TTACTGAAGG CACAGCCTAT ACAAACGCTG TTGGATTTAT GCCTAACCTA 
5880 

TCAGCTTATC CAAAATCTCA CGGTAAAACT GCCAAAAGTA ACATTGTCAG TCAAGTTTAC 
5940 

TTAAACGGAG ACAAAACTAA ACCTGTAACA CTAACCATTA CACTAAACGG TACACAGGAA 
6000 

ACAGGAGACA CAACTCCAAG TGCATACTCT ATGTCATTTT CATGGGACTG GTCTGGCCAC 
6060 

AACTACATTA ATGAAATATT TGCCACATCC TCTTACACTT TTTCATACAT TGCCCAAGAA 
6120 

TAAAGAAGCG GCCGCTCGAG CATGCATCTA GAGGGCCCTA TTCTATAGTG TCACCTAAAT 
6180 

GCTAGAGCTC GCTGATCAGC CTCGACTGTG CCTTCTAGTT GCCAGCCATC TGTTGTTTGC 
6240 

CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC CCACTGTCCT TTCCTAATAA 
6300 



AATGAGGAAA TTGCATCGCA TTGTCTGAGT AGGTGTCATT CTATTCTGGG GGGTGGGGTG 
6360 
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KS CAGGAGA » — GGCATGCTGG CGATG^TG 

6420 

GGCTCTATGG CTTCTGAGGC GGAAAGAACC AGCTGGGGCT CTAGGGGGTA TCCCCACGCG 
6480 

CCCTGTAGCG GCGCATTAAG CGCGGCGGGT GTGGTGGTTA CGCGCAGCGT GACCGCTACA

6540 

CTTGCCAGCG CCCTAGCGCC CGCTCCTTTC GCTTTCTTCC CTTCCTTTCT CGCCACGTTC
6600 

GCCGGCTTTC CCCGTCAAGC TCTAAATCGG GGCATCCCTT TAGGGTTCCG ATTTAGTGCT

6660 

TTACGGCACC TCGACCCCAA AAAACTTGAT TAGGGTGATG GTTCACGTAG TGGGCCATCG 
6720 

CCCTGATAGA CGGTTTTTCG CCCTTTGACG TTGGAGTCCA CGTTCTTTAA TAGTGGACTC 
6780 

TTGTTCCAAA CTGGAACAAC ACTCAAGCGT ATCTCGGTCT ATTCTTTTGA TTTATAAGGG 
6840 

ATTTTGGGGA TTTCGGCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG 
6900 

AATTAATTCT GTGGAATGTG TGTCAGTTAG GGTGTGGAAA GTCCCCAGGC TCCCCAGGCA 
6960 

GGCAGAAGTA TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCAGGTGTGG AAAGTCCCCA
7020

GGCTCCCCAG CAGGCAGAAG TATGCAAAGC ATGCATCTCA ATTAGTCAGC AACCATAGTC
7080

CCGCCCCTAA CTCCGCCCAT CCCGCCCCTA ACTCCGCCCA GTTCCGCCCA TTCTCCGCCC
7140

CATGGCTGAC TAATTTTTTT TATTTATGCA GAGGCCGAGG CCGCCTCTGC CTCTGAGCTA
7200

TTCCAGAAGT AGTGAGGAGG CTTTTTTGGA GGCCTAGGCT TTTGCAAAAA GCTCCCGGGA
7260

GCTTGTATAT CCATTTTCGG ATCTGATCAA GAGACAGGAT GAGGATCGTT TCGCATGATT
7320

GAACAAGATG GATTGCACGC AGGTTCTCCG GCCGCTTGGG TGGAGAGGCT ATTCGGCTAT
7380

GACTGGGCAC AACAGACAAT CGGCTGCTCT GATGCCGCCG TGTTCCGGCT GTCAGCGCAG
7440

GGGCGCCCGG TTCTTTTTGT CAAGACCGAC CTGTCCGGTG CCCTGAATGA ACTGCAGGAC
7500

GAGGCAGCGC GGCTATCGTG GCTGGCCACG ACGGGCGTTC CTTGCGCAGC TGTGCTCGAC
7560

GTTGTCACTG AAGCGGGAAG GGACTGGCTG CTATTGGGCG AAGTGCCGGG GCAGGATCTC
7620

CTGTCATCTC ACCTTGCTCC TGCCGAGAAA GTATCCATCA TGGCTGATGC AATGCGGCGG
7680

CTGCATACGC TTGATCCGGC TACCTGCCCA TTCGACCACC AAGCGAAACA TCGCATCGAG
7740

CGAGCACGTA CTCGGATGGA AGCCGGTCTT GTCGATCAGG ATGATCTGGA CGAAGAGCAT
7800

CAGGGGCTCG CGCCAGCCGA ACTGTTCGCC AGGCTCAAGG CGCGCATGCC CGACGGCGAG
7860

GATCTCGTCG TGACCCATGG CGATGCCTGC TTGCCGAATA TCATGGTGGA AAATGGCCGC
7920


TTTTCTGGAT TCATCGACTG TGGCCGGCTC GGTGTGGCGG ACCGCTATCA GGACATAGCC
7980

TTGGCTACCC GTGATATTGC TGAAGAGCTT GGCGGCGAAT GGGCTGACCG CTTCCTCGTG
8040

CTTTACGGTA TCGCCGCTCC CGATTCGCAG CGCATCGCCT TCTATCGCCT TCTTGACGAG
8100 

TTCTTCTGAG CGGGACTCTG GGGTTCGAAA TGACCGACCA AGCGACGCCC AACCTGCCAT

8160 

CACGAGATTT CGA.TCCAGC GCCGCCTTCT ATGAAAGGTT GGGCTTCGGA ATCGTTTTCC 
8220 

GGGAGGGGGG CTGGATGATC CTCCAGCGCG GGGATCTCAT GCTGGAGTTC TTCGCCCACC 
8280 

CCAACTTGTT TATTGCAGCT TATAATGGTT ACAAATAAAG GAATAGGATC ACAAATTTCA 
8340 

CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC ATCAATGTAT 
8400 

CTTATCATGT CTGTATACCG TCGACCTCTA GCTAGAGCTT GGCGTAATCA TGGTCATAGC

8460 

TGTTTCCTGT GTGAAATTGT TATGCGGTCA CAATTCCACA CAACATACGA GCCGGAAGCA 
8520 

TAAAGTGTAA AGCCTGGGGT GCCTAATGAG TGAGCTAACT CACATTAATT GCGTTGCGCT 
8580 

CACTGCGCGC TTTCCAGTCG GGAAACCTGT CGTGCCAGCT GCATTAATGA ATCGGCCAAC 
8640 

GCGCGGGGAG AGGCGGTTTG CGTATTGGGC GCTC^GCGC TTCCTCGCC ACTGACTCGC 
8700 



WO 98/13499 



- 143 - 



PCT/EP97/05251 



TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA CTCAAAGGCG GTAATACGGT
8760

TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG AGCAAAAGGC CAGCAAAAGG
8820

CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG
8880

AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA CTATAAAGAT
8940

ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC TGTTCCGACC CTGCCGCTTA
9000

CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC GCTTTCTCAA TGCTCACGCT
9060

GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC
9120

CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA
9180

GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA GCGAGGTATG
9240

TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA CGGCTACACT AGAAGGACAG
9300

TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG AAAAAGAGTT GGTAGCTCTT
9360

GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA
9420

CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC
9480

AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA AGGATCTTCA

9540 

CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT CTAAAGTATA TATGAGTAAA

9600 

CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC TATCTCAGCG ATCTGTCTAT

9660 

TTCGTTCATC CATAG^CG TGAG.GGCGG TGGTG.AGA, AACTAGGA.A —«« 
9720 

TACCATCTGG CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT

9780 

TATCAGCAAT AAACCACCCA 
9840 

CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG AGTAAGTAGT TCGCCAGTTA

9900 

„OCG CAACGTTGTT CGCAT^TA GGTGTCACGC -TCG^ 

9960 

GTATGGCTTC ATTGAGC.GC GGTTCCCAAC GATCAAGGCG AGTTACATGA TCCCCCATGT 
10020 

TGTGCAAAAA AGCGGTTAGC TGGTTGGGTG C.GCGA.CG, TGTCAGAAGT AAGTTGGCCG 
10080 

CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC ATGCCATCCG 
10140 

TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC ATTCTGAGAA TAGTGTATGC

10200 

GGCGACCGAG TTGCTCTTGC CCGGCGTCAA TACGGGATAA TACCGCGCCA CATAGCAGAA

10260 
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CTTTAAAAGT GCTCATCATT GGAAAACGTT CTTCGGGGCG AAAACTCTCA AGGATCTTAC 
10320 

CGCTGTTGAG ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 
10380 

TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC GCAAAAAAGG 
10440 

GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT CCTTTTTCAA TATTATTGAA 
10500 

GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT TAGAAAAATA 
10560 

AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC 
10610 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



TGTACACCGG ATCCGGCGCA CACC
24

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CACAACGAGC TCAATTAATT AATTGCCACA TCCTC
35



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(v) FRAGMENT TYPE: internal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:



Thr Leu Trp Thr






(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



Pro Ser Ala Ser Ala Ser Ala Ser Ala Pro Gly Ser 
1 5 10 

1. A packaging cell line expressing one or more adenovirus structural proteins,
polypeptides, or fragments thereof, wherein said structural protein is selected from the
group consisting of:

a. penton base;

b. hexon;

c. fiber;

d. polypeptide IIIa;

e. polypeptide V;

f. polypeptide VI;

g. polypeptide VII;

h. polypeptide VIII; and

i. biologically active fragments thereof.

2. A packaging cell line according to claim 1, which supports the production of a viral
vector.

3. A packaging cell line according to claim 2, wherein the viral vector comprises a
tripartite leader sequence or a sequence substantially homologous thereto.

4. A packaging cell line according to any one of the preceding claims wherein said
structural protein is fiber and wherein said fiber protein has been modified to include a
non-native amino acid residue sequence which targets a specific receptor, but which
does not disrupt trimer formation or transport of fiber into the nucleus.

5. A packaging cell line according to claim 4 wherein said non-native amino acid residue
sequence alters the binding specificity of the fiber for a targeted cell type.

6. A packaging cell line according to any one of the preceding claims wherein said
structural protein is fiber comprising amino acid residue sequences from more than one
adenovirus serotype.

7. A packaging cell line according to any one of claims 2 to 6, wherein said viral vector
includes a nucleic acid sequence having a deletion or mutation of a DNA sequence
encoding an adenovirus structural protein, polypeptide, or fragment thereof.

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8. A packaging cell line according to claim 7 wherein said viral vector includes a nucleic 
acid sequence having a deletion or mutation of the DNA sequences encoding 
regulatory polypeptides E1A and E1B.

9. A packaging cell line according to claim 7, wherein said viral vector further includes a 
nucleic acid sequence having a deletion or mutation of a DNA sequence encoding one 
or more of the following regulatory proteins or polypeptides: E2A, E2B, E3, E4, L4,
or fragments thereof.

10. A packaging cell line according to claim 7, wherein a foreign DNA sequence encoding 
one or more foreign proteins, polypeptides or fragments thereof has been inserted in 
place of any of said deletions in said therapeutic viral vector. 

11. A packaging cell line according to claim 10, wherein said foreign DNA encodes a 
tumor-suppressor protein or a biologically active fragment thereof. 

12. A packaging cell line according to claim 10, wherein said foreign DNA encodes a
suicide protein or a biologically active fragment thereof. 

13. A packaging cell line according to any one of the preceding claims, wherein said cell 
line is an epithelial cell line. 

14. A packaging cell line according to claim 12, wherein said cell line is selected from the 
group consisting of 293, A549, W162, HeLa, Vero, 211, and 211A cell lines.

15. A viral vector comprising deletion or mutation of a DNA sequence encoding an 
adenovirus structural protein, polypeptide, or fragment thereof, wherein said structural 
protein, polypeptide or fragment thereof is selected from the group consisting of: 

a. penton base;

b. hexon;

c. fiber;

d. polypeptide IIIa;

e. polypeptide V;

f. polypeptide VI;

g. polypeptide VII;

h. polypeptide VIII; and

i. biologically active fragments thereof. 

16. A viral vector according to claim 15, further comprising deletion or mutation of the
DNA sequences encoding regulatory polypeptides E1A and E1B, or fragments thereof.

17. A viral vector according to claim 15, further comprising deletion or mutation of the
DNA sequence encoding one or more of the following regulatory proteins or
polypeptides: E2A, E2B, E3, E4, L4, or fragments thereof.

18. [illegible] inserted in place of the DNA sequence encoding said structural protein,
polypeptide, [illegible]

19. [illegible] DNA sequences inserted in place of the DNA sequences encoding said
regulatory polypeptides or fragments thereof.

20. A viral vector lacking all or part of a DNA sequence encoding adenovirus fiber protein,
[illegible] said DNA sequence has been replaced by a foreign DNA sequence encoding
[illegible]

21. [illegible]

22. A viral vector according to any one of claims 15 to 21, wherein the vector is a
therapeutic [illegible]

23. A complementing plasmid comprising a promoter nucleotide sequence operatively
linked to a nucleotide sequence encoding an adenovirus structural protein, polypeptide,
or fragment thereof.

24. A complementing plasmid according to claim 23, further comprising a nucleotide
sequence encoding:

a. a first adenovirus regulatory protein, polypeptide, or fragment thereof; or

b. a second regulatory protein, polypeptide, or fragment thereof; or

c. a third regulatory protein, polypeptide, or fragment thereof; or

d. any combination of the foregoing.

25. A complementing plasmid according to claim [illegible], wherein said adenovirus
structural protein or polypeptide is selected from the group consisting of:

a. penton base;

b. hexon;

c. fiber;

d. polypeptide IIIa;

e. polypeptide V; 

f. polypeptide VI; 

g. polypeptide VII; 

h. polypeptide VIII; and 

i. biologically active fragments thereof. 

26. A complementing plasmid according to claim 24, wherein said regulatory proteins, 
polypeptides or fragments thereof are selected from the group consisting of E1A, E1B,
E2A, E2B, E3, E4, and L4.

27. A complementing plasmid comprising a promoter nucleotide sequence operatively 
linked to a nucleotide sequence encoding an adenovirus regulatory protein, 
polypeptide, or fragment thereof. 

28. A composition useful in the preparation of therapeutic viral vectors, the composition
comprising a cell containing a delivery plasmid comprising an adenovirus genome 
lacking a nucleotide sequence encoding fiber. 

29. A composition according to claim 28, wherein said delivery plasmid further comprises a 
nucleotide sequence encoding a foreign polypeptide. 

30. A composition according to claim 29, wherein said polypeptide is a therapeutic
molecule.

31. A composition according to claim 28, wherein said delivery plasmid is selected from
the group consisting of pDV1, pΔE1Bβgal, pΔE1sp1B, and pFG140-f.

32. A composition according to claim 28, wherein said cell further comprises a 
complementing plasmid containing a nucleotide sequence encoding fiber, said plasmid 
being stably integrated into the cellular genome of said cell. 

33. A composition according to claim 32, wherein said complementing plasmid has the 
characteristics of pCLF having ATCC Accession Number 97737. 

34. A composition useful in the preparation of therapeutic viral vectors, said composition 
comprising a cell containing: 

a. a first delivery plasmid comprising an adenovirus genome lacking a nucleotide
sequence encoding fiber and incapable of directing the packaging of new viral particles
in the absence of a second delivery plasmid; and

b. a second delivery plasmid comprising an adenoviral genome [illegible] the packaging
of new viral particles in the presence of said first delivery plasmid.

35. A composition according to claim [illegible] delivery plasmids [illegible]

36. [illegible]

37. A composition according to claim [illegible], [illegible] nucleotide sequence encoding
a foreign polypeptide.

38. [illegible] molecule.

39. [illegible]

40. [illegible] lacZ reporter construct.

41. A composition according to claim 34, wherein said [illegible] a nucleotide sequence
encoding an adenovirus regulatory protein [illegible] lacks a nucleotide sequence
encoding an adenovirus [illegible]

42. A composition according to claim 41, wherein said regulatory protein [illegible]

43. A composition according to claim 34, wherein said complementing plasmid has the
characteristics of pCLF having ATCC Accession Number 97737.

44. A composition according to claim 34, [illegible] nucleotide sequence encoding
adenovirus E4 protein and lacks a nucleotide sequence encoding adenovirus E1 protein.

45. A composition according to claim 44, wherein said cell contains at least one
complementing plasmid encoding an adenoviral regulatory protein and a structural
protein.

46. A composition according to claim 45, wherein said regulatory protein is E4 and said
structural protein is fiber.

47. A composition according to claim 45, wherein said regulatory protein is E1 and said
structural protein is fiber. 

48. A composition according to claim 45, wherein said regulatory protein is both E1 and
E4 and said structural protein is fiber. 

49. A composition according to claim 45, wherein said adenoviral regulatory protein and 
said structural protein are encoded by separate complementing plasmids. 

50. A composition according to claim 45, wherein said cell is selected from the group 
consisting of 293, A549, W162, HeLa, Vero, 211, and 211A.
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